It is a pretty common scenario these days to create an application using a general purpose programming language (like Java or Dart) and then sprinkle in more specialized languages like SQL or HTML as needed.
Martin Fowler uses the terms host language and domain-specific language (or DSL). He wrote a whole book on this topic.
DSLs are a fascinating topic, and much has been written on the subject. But what interests me is the question of how we integrate all of these DSLs into our application. That's what this post is about.
DSL as String Literal
Probably the most common way to embed a DSL into a Java app is with string literals, where the stuff inside the quotes is the DSL. Here are a few examples:
sql
String q = "SELECT id,firstName from person";
html
String h = "<div>Hello</div>";
regex
Pattern.compile("[dD]ave");
jpa
String q = "SELECT p.id,p.firstName from Person p";
css in html
String h = "<div style='color:red'>Hello</div>";
The problem with putting a language inside of a string is that the IDE (or other tools) cannot help you very much, particularly in terms of:
- catching your errors
- refactoring
- auto-completion
- syntax highlighting
- debugging
Now, if your host language is dynamically typed (like JavaScript) or you don't use an IDE, then you are probably accustomed to not getting help in these areas.
But my primary host language is statically typed (Java) and I do use an IDE (IntelliJ). So I always try to code in a way that maximizes the IDE's ability to help me. But for string literal DSLs this is difficult.
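To make the "catching your errors" point concrete, here is a minimal sketch. The class and method names are made up for illustration. Both calls are perfectly valid Java as far as the compiler is concerned; the broken regex is only discovered when the code actually runs.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Illustrative only: shows that a DSL error inside a string literal
// surfaces at runtime, not at compile time.
class StringLiteralDslDemo {

    // Returns true if the regex text compiles, false if it is malformed.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The Java compiler accepts both lines without complaint.
        System.out.println(compiles("[dD]ave"));   // well-formed regex
        System.out.println(compiles("[dD]ave("));  // unclosed group
    }
}
```

Without help from the IDE, the second mistake stays hidden until that line is executed.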
DSL as API
Another solution is to replace the string literal DSL with an API. This is the approach jOOQ takes: queries are built by calling Java methods rather than by writing SQL text.
This addresses some of the problems mentioned above with "DSL as String Literal". But it is usually more verbose.
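Here is a rough sketch of the idea in plain Java. To be clear, this is not jOOQ's API. The `SelectQuery`, `Column`, and `PersonTable` classes are hypothetical and exist only to show the shape of the approach: column names become Java constants, so a misspelled column is a compile error and the IDE can auto-complete and refactor them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical column type; not taken from any real library.
final class Column {
    final String name;

    Column(String name) { this.name = name; }

    // Builds a "column > value" condition as SQL text.
    String gt(int value) { return name + " > " + value; }
}

// Hypothetical table description with one constant per column.
final class PersonTable {
    static final String NAME = "person";
    static final Column ID = new Column("id");
    static final Column FIRST_NAME = new Column("firstName");
    static final Column AGE = new Column("age");
}

// Hypothetical fluent builder that renders a SELECT statement.
final class SelectQuery {
    private final List<String> columns = new ArrayList<>();
    private final List<String> conditions = new ArrayList<>();
    private String table;

    static SelectQuery select(Column... cols) {
        SelectQuery q = new SelectQuery();
        for (Column c : cols) {
            q.columns.add(c.name);
        }
        return q;
    }

    SelectQuery from(String table) { this.table = table; return this; }

    SelectQuery where(String condition) { conditions.add(condition); return this; }

    String toSql() {
        StringBuilder sb = new StringBuilder("SELECT ")
                .append(String.join(", ", columns))
                .append(" FROM ").append(table);
        if (!conditions.isEmpty()) {
            sb.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return sb.toString();
    }
}
```

A call such as `SelectQuery.select(PersonTable.ID, PersonTable.FIRST_NAME).from(PersonTable.NAME).where(PersonTable.AGE.gt(21)).toSql()` produces the SQL text, and it also shows the verbosity cost compared with the one-line string literal.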
APIs that look like a DSL
In some languages (particularly Ruby) there is great latitude in the way APIs can be defined and called. So much so that a Ruby API can look very much like a DSL. And it's not in a string literal. It's actually part of the language. And the distinction between API and DSL starts to become blurry. Martin Fowler calls these internal DSLs. Here is an SQL select statement in a Ruby internal DSL:
statement = Select[:id, :firstName, :age].from[:people].where do
  equal :lastName, 'Ford'
  greater_than :age, 21
end
An Internal DSL is really just an API that looks like a DSL.
IDEs and DSL String Literals
Some modern IDEs are smart enough to figure out that a string literal is actually a DSL, and they provide some extra help. This is a really cool feature! Below are screenshots of IntelliJ's awesomeness with string literal DSLs:
[Screenshots not reproduced: IntelliJ's support for sql, jpa, regex, and css/html inside string literals]
I can't overstate how useful this functionality is. It is so useful that, in my opinion, this should be a major consideration in any new general purpose programming language (GPL). That is: how well does the GPL (and its tooling) deal with language-in-language?
Multi-line String Literals
Many languages allow string literals to span multiple lines. For example, in Dart you can use the triple quote:
String html = '''
<table>
<tr><td>First Name</td></tr>
<tr><td>Last Name</td></tr>
</table>
''';
String Templates/Interpolation
Most languages that support multi-line string literals also support string interpolation. For example, in the Dart multi-line string example above, you can embed a Dart expression inside the HTML using the ${ } syntax:
Person p = getPersonFromSomewhere();
String html = '''
<table>
<tr><td>First Name</td><td>${p.firstName}</td></tr>
<tr><td>Last Name</td><td>${p.lastName}</td></tr>
</table>
''';
In this case, what we really have is a String template.
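For comparison, here is roughly what the same template looks like in Java. This sketch assumes Java 15 or later, which has multi-line text blocks and `String.formatted`. Java has no `${ }` interpolation in the style of Dart, so `%s` placeholders stand in for it here. The class and method names are made up for the example.

```java
// Illustrative only: a Java text block used as an HTML template.
class TextBlockDemo {

    // Fills the two %s placeholders in order.
    // Note: values are inserted as-is, with no HTML escaping.
    static String personTable(String firstName, String lastName) {
        return """
            <table>
              <tr><td>First Name</td><td>%s</td></tr>
              <tr><td>Last Name</td><td>%s</td></tr>
            </table>
            """.formatted(firstName, lastName);
    }

    public static void main(String[] args) {
        System.out.println(personTable("Dave", "Ford"));
    }
}
```

Note that the placeholders are positional, so nothing ties `%s` to a named expression the way `${p.firstName}` does. As far as the compiler is concerned, the text between the quotes is still just a string.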
DSL Inversion
Another solution to this problem is to flip it. For example, instead of embedding HTML in Java, embed Java in HTML. This is what JSP is all about. HTML is the host language and Java is the embedded language.
DSL Literals
Java Literal
In JSP, Java is not embedded like this:
'''
int x = 4;
int y = 2;
int z = x + y;
'''
Rather, JSP supports a Java literal:
<%
int x = 4;
int y = 2;
int z = x + y;
%>
Reg Ex Literal
JavaScript supports the regular expression literal:
var r = /[dD]ave/;
Extensible Literal Types
Given the prevalence of language-in-language, I think any new language should have some mechanism for supporting this. I propose a modifier to the triple quote syntax that adds some suggestion (to tooling and readers) as to the type of string contained:
Sql q = sql'''
SELECT id,firstName from person where id = 7 '''
Html h = html'''<div>Hello</div>'''
Regex r = regex'''[dD]ave [fF]ord'''
Jpa q = jpa'''
SELECT p.id,p.firstName
from Person p
where p.age > 21
'''
Html h = html'''<div style='color:red'>Hello</div>'''
Templating
I want the above-mentioned extensible literals. But I also want to use embedded ${} expressions:
Sql q = sql'''
SELECT id,firstName
from person
where id=${p.id}'''
Html h = html'''<div>Hello ${msg}</div>'''
Regex r = regex'''[dD]ave ${lastName}'''
Jpa q = jpa'''
SELECT p.id,p.firstName
from Person p
where p.age > ${p.id}
'''
Html h = html'''<div style='color:${color}'>${msg}</div>'''
This creates some extra complexities. For example, should the string be parsable before the embedded Dart expressions are evaluated? I think so. Otherwise, IDEs and tools would not be able to help much, defeating the whole point. Therefore, there must be some restrictions on where in the string ${} expressions are allowed. And these restrictions would be different for each DSL.
For example, the following would not be a valid template, even though it evaluates to a valid SQL statement:
var s = 'ECT';
var q = sql'''
SEL${s} id,firstName
from person''';
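One way such a restriction could be checked is sketched below in Java. The rule used here is my own simplification and is not part of any existing tool: a `${}` expression must stand as a whole token, so it may not touch a letter, digit, or underscore on either side. A real checker would need rules specific to each DSL's grammar. The `TemplateCheck` class is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker for one simplified placement rule.
class TemplateCheck {

    // Matches a ${...} expression that contains no closing brace inside it.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{[^}]*\\}");

    // Returns false if any ${...} is glued to an identifier character,
    // which would let it complete a keyword or name (as in SEL${s}).
    static boolean placeholdersAreWholeTokens(String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        while (m.find()) {
            boolean gluedBefore = m.start() > 0
                    && isWordChar(template.charAt(m.start() - 1));
            boolean gluedAfter = m.end() < template.length()
                    && isWordChar(template.charAt(m.end()));
            if (gluedBefore || gluedAfter) {
                return false;
            }
        }
        return true;
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}
```

Under this rule the `SEL${s}` template is rejected, while a template ending in `where id=${p.id}` is accepted, because the expression sits where a complete value is expected.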
Bottom Line
A few things are certain:
- Language-in-language will always be needed, especially on the web. Whether it is called templates, DSLs, or polyglot programming, it's here to stay.
- We need something more than just a multi-line string with interpolation for these embedded DSLs. We want our tools to help us with finding errors early, refactoring, auto-completion, syntax highlighting, and debugging. We want self-documenting code.
The two possible solutions I can think of are:
- Extensible literals as proposed above
- Internal DSLs
I prefer the extensible literal approach, primarily based on how awesome this is in IntelliJ with Java. And that's without any special language support at all.