Google Common Lisp style guide (google.github.io)
kazinator 1342 days ago [-]
> You must not use INTERN or UNINTERN at runtime.

I.e. you must not read Lisp data at run time, if it contains symbols, because that will call intern.

> You should avoid using a list as anything besides a container of elements of like type.

Good-bye, code-is-data.

I could reduce this guide by a good 30% with "You should avoid using Lisp as anything other than Go or Java".

But that could be seen as defining a macro, which you must seldom do.

rvense 1342 days ago [-]
What it says is to not abuse lists, isn't it? I think what they mean is something like, don't use a list as an "object"/tuple that gets picked apart with a lot of cdddaring? Rather, use real structures and other containers as appropriate, keeping in mind the performance characteristics. It specifically says lists are appropriate for macros and functions used by macros at compile-time.
reikonomusha 1342 days ago [-]
This is exactly what they mean.
kazinator 1342 days ago [-]
I think it's perfectly fine to use a nested list object that gets picked apart with destructuring until the point that it becomes a maintenance or performance problem. (That point could arrive later that same day, or it might never come.)

This approach is one of the things that make Lisp Lisp; if it gives you an allergic reaction, use something else.

dreamcompiler 1342 days ago [-]
I have seen things you people wouldn't believe. 500-line 7-deep plists off the shoulder of Orion. I watched CLOS glitter in the dark, unused, near the Tannhäuser Gate. 28-clause typecase statements because the author didn't understand generic functions. All those moments will be lost in time, like tears in rain.

And thank dog for that.
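
For readers who have not met generic functions, the contrast with a long typecase can be sketched roughly as follows. Python is used here only as an analogy, not as a claim about CLOS: `functools.singledispatch` stands in for a generic function, and the `Circle`, `Square`, `area_typecase` and `area` names are hypothetical, invented for the illustration.

```python
from dataclasses import dataclass
from functools import singledispatch
import math


@dataclass
class Circle:
    radius: float


@dataclass
class Square:
    side: float


def area_typecase(shape):
    # The "typecase" style: one function that must know every type.
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    elif isinstance(shape, Square):
        return shape.side ** 2
    raise TypeError(f"unsupported shape: {shape!r}")


@singledispatch
def area(shape):
    # Default method: reached only when no registered type matches.
    raise TypeError(f"unsupported shape: {shape!r}")


@area.register
def _(shape: Circle):
    return math.pi * shape.radius ** 2


@area.register
def _(shape: Square):
    return shape.side ** 2


shapes = [Circle(1.0), Square(2.0)]
print([area(s) for s in shapes] == [area_typecase(s) for s in shapes])
```

With the dispatching version, supporting a new type means registering one more method next to that type's definition, rather than adding a 29th clause to a central function.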

kazinator 1342 days ago [-]
Such things are quantifiable. We can have an exact rule which says that you can use an ad-hoc list as a structure if (for instance):

1. No more than 17 functions handle this datum, spread among no more than three source files.

2. The structure contains no more than 8 conses.

3. In a long-running application under a typical production load, no more than 10,000 of these objects are freshly allocated in any five minute period.

4. A major software component (such as a library) can internally have at most three separate instances of such a data type, and they are not to be involved in the APIs between major software components.

Okay, that's now a target we can enforce without wishy washy judgment calls.

dragonwriter 1342 days ago [-]
> Okay, that's now a target we can enforce without wishy washy judgment calls.

Sure, you can have an exact set of rules like that, and feel free to have an automated enforcement of your own set of exact rules. There are good reasons coding style guides often include things which are not exact rules, and the target for them is often not automated enforcement but supporting human judgement that balances multiple factors. Yes, that results in fuzzy boundaries, but that's because experience has shown that there is an issue but has not (yet) provided a sufficient basis for a quantifiable boundary, since the set of factors being balanced is complex and multidimensional. Reducing the dimensionality for simplicity of automated enforcement is easier, but not necessarily better.

joshuamorton 1342 days ago [-]
> Such things are quantifiable. We can have an exact rule which says that you can use an ad-hoc list as a structure if (for instance):

So what happens when you have a datum that is used in exactly 17 functions, but you need to add another feature?

Or, what happens when people combine functions into larger ones to avoid having to define a real datatype, or...

These concerns exist for each thing.

"Don't use a list where you really have a struct" is much more concrete and quantifiable.

kazinator 1342 days ago [-]
> but you need to add another feature?

You have to add your feature in such a way that the resulting code meets the rules.

It's exactly the same as when you need to add something to a line that is already 79 characters long (maximum allowed by your coding convention). Or if you have to add lines of code into a function that is already 200 lines long (coding style max). Or if you need another argument in some API that already has the maximum of 8. You have to step back to some extent and change more of the surrounding program than just shoving your intended change into it.

> Or, what happens when people combine functions into larger ones to avoid having to define a real datatype

If the code remains under the quantified limit for function size, then it is complying with the document.

> "Don't use a list where you really have a struct" is much more concrete and quantifiable.

There are two answers to "where do you really have a struct?" One is purely opinionated, and one is objective. The objective answer is: "you really have a struct in all situations where the list isn't a variable length container of items of the same type".

So the concrete and quantifiable (and therefore the right) interpretation of the rule amounts to never using a fundamentally characteristic Lisp technique, in Lisp!

TeMPOraL 1342 days ago [-]
I once got a huge performance win in a commercial project by discovering that some numerically-heavy computations on large vectors of numbers were actually using lists to keep those numbers. I replaced those universally with arrays, yielding some ridiculous efficiency benefit. The lists were fine when there were 5 numbers in the bag, but nobody noticed when the amount of numbers grew to thousands...
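
The access-cost side of this can be sketched with a small model. This is not the code from that project, just an illustration in Python: the `Cons`, `from_iterable` and `nth` names are hypothetical, and a built-in Python `list` (which is array-backed) stands in for a Lisp array.

```python
class Cons:
    """A single list cell, as in Lisp: a value (car) and a link (cdr)."""

    __slots__ = ("car", "cdr")

    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr


def from_iterable(items):
    """Build a chain of cells holding items in order; None is the empty list."""
    head = None
    for item in reversed(list(items)):
        head = Cons(item, head)
    return head


def nth(cell, index):
    """Return (value, links_followed) for the element at index."""
    steps = 0
    while index > 0:
        cell = cell.cdr
        index -= 1
        steps += 1
    return cell.car, steps


numbers = list(range(5000))          # array-backed: indexing is one step
linked = from_iterable(numbers)      # cons cells: indexing walks the chain

value, steps = nth(linked, 4999)
print(value, steps)                  # 4999 4999
print(numbers[4999])                 # 4999
```

Reaching the last of five elements costs four link traversals, which nobody notices; reaching the last of thousands costs thousands, on every access.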
wrycoder 1342 days ago [-]
I believe Alan Perlis said that Lisp programmers know the value of everything and the cost of nothing.
AndyMcConachie 1342 days ago [-]
I love this post.
reikonomusha 1342 days ago [-]
The purpose of code guidelines is to do exactly that: guide developers into a certain practice that’s consistent and broadly agreeable in a team setting. Using lists to fake objects isn’t good modern style, and it’s a reasonable guideline to suggest not doing that. I’m sure Googlers won’t gripe if you temporarily cons up a list to shuttle it around. I’m sure they will gripe if you’re doing SICP 101 style object definitions exclusively with lists. The latter easily gets out of hand in a language renowned for its several robust facilities for object definitions.

MAXIMA (née MACSYMA) is known for lists-as-objects not only being depended upon, but also getting so out of hand it’s nearly impossible to refactor now. MAXIMA at least has the excuse of being very old software.

I feel you’re interpreting this style guide as some kind of draconian law of writing Lisp code at Google, and making awful conclusions from it (e.g., “well if you’re allergic to objects as lists might as well not use Lisp” or “most of this can be summed up as writing Lisp like Java”).

kazinator 1342 days ago [-]
In my experience, these rules end up deployed in such a way that you cannot commit any change until they are obeyed to the letter. (That goes doubly so if they get encoded into an automated tool.)

I don't disagree with that; a coding standard that is not enforceable without generous judgment calls is far less useful than a rigorous one.

Make it exact, and then get everyone to stick to it.

aidenn0 1342 days ago [-]
I have the exact opposite experience; the job where I had my favorite coding standard explicitly had ways of allowing variances. The CEO came down hard on anyone who said that "The standard forbids X" and would tell them to reread the standard because it actually said something like "Don't do X without doing Y first."

The basic mentality was if you couldn't responsibly follow the style guidelines after working there for a year, then you should be looking for work elsewhere.

monadic2 1342 days ago [-]
For an alternative point of view, I write lisp for my day job and would not approve any code that just shoved a bunch of unrelated data in a list. Thinking about the destructuring code, especially if you don’t have access to pattern matching, is enough to give me a headache. You know that’s not going to be documented, and if it is, people won’t be able to find the documentation. Just use records; every lisp has records, and it’s the point of records, and it’s not any less “elegant” or “beautiful” or “lispy” or whatever to have named fields and nice printing and accessors for free.

The one exception I can think of is procedures with variadic arguments.
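
The records-versus-lists point can be sketched as follows. Python's `dataclass` is used here as a stand-in for a Lisp record type, and the `Booking` type, its fields and the sample values are all hypothetical.

```python
from dataclasses import dataclass

# Ad-hoc list: the meaning of each position lives only in the reader's head.
booking_list = ["LHR", "JFK", 2, 412.50]
origin = booking_list[0]
fare = booking_list[3]


@dataclass
class Booking:
    # Named fields, a readable printed form and accessors come for free.
    origin: str
    destination: str
    passengers: int
    fare: float


booking = Booking("LHR", "JFK", 2, 412.50)
print(booking)
print(booking.fare == fare)
```

Both forms hold the same four values; only the record says what they are at the point of use and when printed.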

kazinator 1342 days ago [-]
Why don't you have access to pattern matching in your Lisp day job?

(There is at least a coffee machine, health benefits, and a halfway ergonomic chair to sit on, hopefully.)

monadic2 1342 days ago [-]
> Why don't you have access to pattern matching in your Lisp day job?

The lisp I use (scheme) does not have pattern-matching built-in and nobody has made a good enough case to add it as a special form.

oalae5niMiel7qu 1342 days ago [-]
> scheme

There's your problem. Here's a nickel (toss). Get yourself a real Lisp.

divs1210 1342 days ago [-]
> you must not read Lisp data at run time, if it contains symbols, because that will call intern

or use a non-interning reader. Clojure does exactly that - instead of clojure.core/read, you use clojure.edn/read to read data without running it as code

kazinator 1342 days ago [-]
That doesn't work if you want symbols to be symbols, such that if x occurs in two places in the data, it is the same object according to eq.

According to this coding guideline, you cannot develop a .fasl format that is made of Lisp read syntax, or exploit Lisp for sophisticated, structured data formats in general.

ambulancechaser 1342 days ago [-]
can you explain?

    ((juxt type identity) (clojure.edn/read-string "x"))
    [clojure.lang.Symbol x]

It seems that the reader returns symbols just fine.
kazinator 1342 days ago [-]
I don't use Clojure; I have no idea. If two occurrences of "x" are mapped to the same object, that is interning; maybe what that does is use its own package-like namespace, separately allocated for each call.

The documentation for EDN says that "nil, booleans, strings, characters, and symbols are equal to values of the same type with the same edn representation." The only way two symbol values can be equal is if they are actually the same symbol, I would hope.

john-shaffer 1342 days ago [-]
> The only way two symbol values can be equal is if they are actually same symbol, I would hope.

Why is this important? Specifically, why do symbols need to be interned?

In Clojure, "Two symbols are equal if they have the same namespace and symbol name." In general, "Clojure’s = is true when comparing immutable values that represent the same value, or when comparing mutable objects that are the identical object." [1]

[1] https://clojure.org/guides/equality

kazinator 1342 days ago [-]
If we read two symbol tokens from a stream, and the lowest-level equality function that is available to us does not distinguish them, then they are interned.

Because symbols are used to refer to things, whether or not they are mutable can be blurry. You can make symbols as immutable as you want, but as soon as you make one of those symbols a key which refers to a mutable object, such as a global environment, then effectively, the symbol appears as a gateway to something mutable, and you can't necessarily tell whether the mutability is in the symbol itself or something beyond it.

For instance, let's consider global variables. The definition of a global variable has an effect which we can inspect if we have a boundp function:

  (boundp 'x) -> nil
  (defvar x)
  (boundp 'x) -> t

That can be made to work by mutating the symbol (the global binding information can be right inside the symbol). Or it could be working by keeping the symbol immutable, but mutating some hash table of bindings.

Either way, the symbol looks interned, because we have mentioned it several times, and those mentions seem to be connected. The (defvar x) has an effect on (boundp 'x) and so they are referring to an x which is somehow the same.

It could work with x actually being a kind of character string, which got separately allocated three times. As long as we can't show any property of the system indicated by x to be different based on which copy of x we are using to enquire (e.g. boundp reports true for one x and false for another), then x looks interned.
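
The "immutable name plus a mutable table of bindings" variant can be sketched in Python, purely as a model and not as a description of any Lisp implementation. The `bindings` table and the `defvar`, `boundp` and `fresh` helpers are hypothetical names; `sys.intern` is Python's own string-interning function.

```python
import sys

bindings = {}  # hypothetical global environment: name -> value


def defvar(name, value=None):
    """Record a global binding for name unless one already exists."""
    bindings.setdefault(name, value)


def boundp(name):
    return name in bindings


def fresh(name):
    """Build a new string at run time, as a reader would for each token."""
    return "".join(list(name))


before = boundp(fresh("counter"))
defvar(fresh("counter"))
after = boundp(fresh("counter"))
print(before, after)  # False True

# An explicit intern table maps equal names to one shared object.
print(sys.intern(fresh("counter")) is sys.intern(fresh("counter")))  # True
```

Each call to `fresh` builds its own string, yet the binding made through one copy is visible through another, because the table is keyed on equality of names. That is the sense in which such names "look interned" even when no identity test would agree.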

lmz 1342 days ago [-]
reitzensteinm 1342 days ago [-]
In Clojure land, equal but not identical? symbols don't cause any issues; they can be used interchangeably as map keys, etc. It won't impact code correctness, just potentially cause slowdowns.

With that said, I always thought symbols would intern, but that's not the case. It is true with keywords, however.

(identical? (clojure.edn/read-string "x") 'x) => false

(= (clojure.edn/read-string "x") 'x) => true

(identical? (clojure.edn/read-string ":x") :x) => true

kazinator 1341 days ago [-]
Objects that can be equal but not identical are not symbols.

They are, at best, cargo culted symbols: character strings with a tag bit which says "read/print me without quotes, so I visually look like something out of Lisp".

reitzensteinm 1340 days ago [-]
You don't use Clojure, but you're willing to jump in and criticize one part of it that, given the context of the rest of the system, could not be less important?

Whether Clojure's object model and equality semantics as a whole make sense is certainly up for debate. It's highly opinionated and no silver bullet.

But once it's in place, the decision of whether to intern symbols is a trivial implementation detail.

I incorrectly assumed they were interned for seven years of using Clojure professionally, it has never made a difference, and I can't come up with a scenario where it plausibly would.

divs1210 1339 days ago [-]
Common Lisp Elitism is a real thing. This person appears to be from the "Clojure is not Lisp" clan of gatekeepers.
lispm 1338 days ago [-]
almost every language which has Lisp in its name is using some form of symbol tables for interning, from McCarthy's Lisp 1 implementation onwards. That's one of the defining features of the Lisp s-expression reader.

If the 'reader' reads an s-expression like (EMACS LISP IS A LISP DIALECT) then both occurrences of LISP are the same identical Lisp object, both are the same symbol.

If your language is doing something different, then it's not using symbols like Lisp-like languages usually do since the dawn of time.

fulafel 1339 days ago [-]
I think the characterisation of Clojure is not that unfair here. It's Clojure keywords that have the role that Lisp's symbols have (and I think they have better ergonomics), and symbols are mostly only used for source code representation.

In other Lisps the detailed semantics of symbols are more important including the identity/interning thing.

Rich Hickey was a Common Lisp user before making Clojure so there's a fair chance he knew how symbols worked there, so the cargo culting characterisation should be applied only light heartedly :)

reitzensteinm 1339 days ago [-]
That Clojure is keyword heavy is true, but it's so important to note that objects are essentially never checked for reference equality in Clojure, even when e.g. looking up keys in a hash map (see demo).

With this in mind, you could stop interning keywords and damn near every Clojure program would continue to work just fine - but with a noticeable slowdown.

Or, more sensibly and to bring it back to the theme of the thread, you could add a second non-interning Keyword type which can safely be generated while deserializing user input in a long running process, that you can use interchangeably with standard keywords, but will be garbage collected away with the rest of the deserialized data when you're done.

You do pay a hefty penalty here because you're hiding everything behind interfaces and abstractions. It's totally fine to not like the system, or believe it's not worth the performance hit.

But it does mean that a potentially equal but not identical symbol isn't some off brand low quality replacement as GP suggests, it's just... a symbol.

Pastebin demo: https://pastebin.com/cbWiNyEL

kazinator 1338 days ago [-]
I wouldn't write a config file parsing library for C programs without interning, so == between two pointers could be used to test for keyword equality.

Interning is used outside of Lisp. See the XInternAtom function in the X Window system:

  Atom XInternAtom(
    Display *display,
    char *atom_name,
    Bool only_if_exists
  );
 
or RegisterClass in Win32:

  ATOM RegisterClassA(
    const WNDCLASSA *lpWndClass
  );
reitzensteinm 1338 days ago [-]
I wouldn't either :)
lispm 1338 days ago [-]
For Lisp the following is the usual behavior: Symbols are by default interned and identical symbols are tested with EQ to be T.

  > (eq (read) (read))
  a a 
  T

The default test function is EQL, which is using EQ to test symbols. In Common Lisp #:a would be an uninterned symbol with the name "A".

  > (find 'a '(#:a a))
  A

  > (find 'a '(#:a a) :test #'string-equal)
  #:A

Setting the value of a symbol will basically work in all Lisps with symbols in similar fashion like this:

  > (dolist (item '(a b c a))
      (set item (if (and (boundp item)
                         (numberp (eval item)))
                    (1+ (eval item))
                    1)))
  NIL

  > (mapcar 'eval '(a b c a))
  (2 1 1 2)

This last example will run unchanged in, for example, Emacs Lisp and Common Lisp.
reitzensteinm 1338 days ago [-]
What is the purpose of creating uninterned symbols?
lispm 1338 days ago [-]
They could be used as symbols which can be GCed.

Though a typical use is in macros, where macros introduce new symbols and these should never clash with any existing symbol and to which there should be no access via the name.

Example: A macro which writes the form, the value and which returns the value. GENSYM generates a named/counted uninterned symbol.

  > (defmacro debugit (form &aux (value-symbol (gensym "value")))
      `(let ((,value-symbol ,form))
         (format t "~%The value of ~a is ~a~%" ',form ,value-symbol)
         ,value-symbol))
  DEBUGIT

If we look at the expanded code of an example, we can see uninterned symbols:

  > (pprint (macroexpand-1 '(debugit (sin pi))))

  (LET ((#:|value1093| (SIN PI)))
    (FORMAT T "~%The value of ~a is ~a~%" '(SIN PI) #:|value1093|)
    #:|value1093|)

We can also let the printer show us the identities of these symbols, labelling objects which are used multiple times in an s-expression:

  > (setf *print-circle* t)
  T

  > (pprint (macroexpand-1 '(debugit (sin pi))))

  (LET ((#2=#:|value1095| #1=(SIN PI)))
    (FORMAT T "~%The value of ~a is ~a~%" '#1# #2#)
     #2#)

Thus we can see above that it's just one uninterned symbol used in three places.

Example run:

  > (debugit (sin pi))

  The value of (SIN PI) is 1.2246063538223773D-16
  1.2246063538223773D-16
reitzensteinm 1338 days ago [-]
That seems like a bad idea, since you've now got two symbols with the same name that'll fail eq? Is this ever actually done?

Interesting that gensym returns uninterned symbols, thanks.

lispm 1338 days ago [-]
The uninterned symbols don't fail EQ if they are the same identical symbol.
reitzensteinm 1338 days ago [-]
Great, thanks for filling me in. Any idea why the Google guide is against using them for this purpose?
lispm 1338 days ago [-]
What does the guide say?

Keep in mind that this is a guide from a Lisp-using company (bought by Google) who wrote specifically two large applications partly, but significantly, in Lisp: a search engine for flight travel and an airline reservation system. Other application teams may have different rules and requirements, given that they may use Lisp in very different ways.

reitzensteinm 1338 days ago [-]
I'm sorry, my question was based on a four day old memory from the main thread that doesn't reflect what the guide says. It just says don't intern symbols at runtime. Presumably you can pass some kind of a flag to the reader to read lisp data structures without interning the symbols that it reads?

I know the story of ITA well, it and PG's writings are what got me interested in lisp in the first place. Which makes me feel old. But not comp.lang.lisp old, it's all relative!

fulafel 1339 days ago [-]
Good points.

It's interesting that despite this keywords are serialized all the time in Clojure land (eg in the transit format that is commonly used for frontend/backend communication).

reitzensteinm 1339 days ago [-]
I think Google's warning definitely applies to Clojure.

Most json libraries will convert string keys to keywords, and they're not weak references.

An attacker can probably just send a few dozen gigabytes of random json to the average Clojure app and it's going to go down.

kazinator 1339 days ago [-]
I thought I was one of these gatekeepers; and that was before I found out that Clojure doesn't actually have symbols, but just a string type with a quote-free read syntax.

Even AutoCAD's AutoLisp (the old one from the 1980's) has interned symbols.

How symbols work goes back all the way to the original McCarthy work, and all of its actual (not cargo-culted) descendants.

It is not "Common Lisp" elitism.

reitzensteinm 1338 days ago [-]
For what it's worth, I agree with you. Clojure only weakly holds on to its lisp heritage. I'd liken it to the similarity between C# and C++ (without implying any superiority to either side). Superficially quite similar, some parts near identical, others a complete detour.

With that in mind, however, basing your critique of Clojure on the extent to which it carries the lisp tradition is bizarre. Your criticisms are born of an ignorance of the value proposition Clojure provides, which would not be terribly different even if it had eschewed lisp syntax in favor of something else.

If you actually learned Clojure, there is zero chance you'd be complaining about symbol interning. It's just so ridiculous. You'd probably still think the whole thing is a waste of time, and I'm sure you'd have a big long list of actual, meaningful complaints.

I've seen people criticize TXR for its ugly syntax once or twice here on HN (I pay close attention to lisp posts here), and I thought that was dumb at the time. I'm not interested in learning it but I'm glad you're trying something new. It's a shame to see you stoop to the same level of drive by dismissal.

But whatever. Let's flip each other's bozo bits and move on.

kazinator 1337 days ago [-]
Speaking of TXR, and of "holding on weakly", as a result of this discussion, I made a little change.

A remark was made somewhere that interned symbols are held with a non-weak reference. But it occurred to me that this isn't something engraved in stone. A package should be able to hold on to its interned symbols via weak references. This means that if the only reference to a symbol is from within a weak package, that symbol can be removed from the package and reclaimed by the garbage collector.

Since a package uses hash tables, and hash tables support weak keys, it's trivial to put the two together. I added an argument to make-package to specify a weak package.

In the following test, the symbols interned into package foo get reclaimed because it is weak. Those interned into bar don't get reclaimed:

  (defun weak-package-test (name weak)
    (let ((wp (make-package name weak)))
      (let ((*package* wp))
        (let ((obj (read "(a b c d e f)")))
          (mapcar (op finalize @1 prinl) obj)))))

  (weak-package-test "foo" t)
  (sys:gc t)

  (weak-package-test "bar" nil)
  (sys:gc t)

  $ ./txr weak-package.tl 
  foo:a
  foo:b
  foo:c
  foo:d
  foo:e
  foo:f

I'm committing to this as a documented feature.
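
The weak-package idea can also be modelled outside Lisp. The following is a minimal sketch in Python, not TXR's implementation: the `Symbol` and `Package` classes and the `read_symbols` helper are hypothetical, and the standard `weakref.WeakValueDictionary` plays the role of the hash table with weak references.

```python
import gc
import weakref


class Symbol:
    """A named object whose identity, not its name, is what matters."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Symbol({self.name!r})"


class Package:
    """Hypothetical symbol table; weak=True holds symbols via weak references."""

    def __init__(self, weak=False):
        self.table = weakref.WeakValueDictionary() if weak else {}

    def intern(self, name):
        # Return the existing symbol for this name, or create and record one.
        sym = self.table.get(name)
        if sym is None:
            sym = Symbol(name)
            self.table[name] = sym
        return sym


def read_symbols(package, text):
    """Toy reader: intern each whitespace-separated token into package."""
    return [package.intern(token) for token in text.split()]


weak_pkg = Package(weak=True)
strong_pkg = Package(weak=False)

data = read_symbols(weak_pkg, "a b c a")
same_object = data[0] is data[3]      # both occurrences of "a" are one object
read_symbols(strong_pkg, "a b c a")   # result discarded straight away

del data                              # drop the only strong references
gc.collect()

weak_count = len(weak_pkg.table)
strong_count = len(strong_pkg.table)
print(same_object, weak_count, strong_count)
```

While the data that was read is alive, repeated names still map to a single object; once it is dropped, the weak table empties itself, whereas the ordinary table keeps every name it has ever seen.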
kazinator 1337 days ago [-]
I mentioned AutoLISP to make it clear that the comment is completely unrelated to value proposition, but in hindsight, that may be too obscure.
reitzensteinm 1337 days ago [-]
You call Clojure a cargo cult descendant, which implies it's failing to achieve what its ancestors did, by copying features haphazardly without understanding the deeper motivations behind them.

Clojure is a shitty substitute for CL, but that's just not what it is trying to be. It is an interesting and worthy system in its own right.

kazinator 1337 days ago [-]
No it doesn't imply that; you can easily succeed while copying features haphazardly and without understanding the deeper motivations behind them.
Spivak 1342 days ago [-]
I really don't think that this is bad advice in general. Mess with the code all you want at compile time, but don't touch it at runtime is the good kind of boring. CL is an extremely powerful language, it doesn't mean you should be using it all the time in your day-to-day work.
kazinator 1342 days ago [-]
The thing is, you usually shouldn't be calling INTERN even at compile time. A better rule is this:

Don't use code to calculate character strings, which are then converted to symbols via INTERN. The main exceptions to this rule are structs (which generate slot-reader functions by combining the structure name and slot names).

Macrology which calculates names using code, which are then supposed to be explicitly referenced in code, is pretty stinky.

E.g.:

  (define-blob foo ...)

Here, you're supposed to Just Know that the above is referenced as blob-foo and not foo, because internally it catenates "BLOB-" onto (symbol-name 'foo), and calls intern on that.
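
The same smell can be reproduced in other languages. As a rough Python analogy (the `define_blob` function, the `registry` dictionary and the `blob_` prefix are all hypothetical), a definer that assembles a name from strings leaves nothing in the source for a reader or a search tool to find:

```python
def define_blob(namespace, name):
    """Hypothetical definer: installs a constructor under a computed name."""
    computed = "blob_" + name          # the name is assembled from strings

    def constructor(**fields):
        return {"kind": name, **fields}

    namespace[computed] = constructor
    return computed


registry = {}
define_blob(registry, "foo")

# Callers must know to ask for blob_foo, a name no definition spells out.
print(sorted(registry))                 # ['blob_foo']
print(registry["blob_foo"](size=3))     # {'kind': 'foo', 'size': 3}
```

Searching the source for a definition of `blob_foo` turns up only call sites, which is the maintenance problem described above.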
dreamcompiler 1342 days ago [-]
Above is one of the many reasons I prefer defclass to defstruct. Defclass doesn't do this ridiculous nonsense.
tgbugs 1342 days ago [-]
Many times when working with code that uses structs in Racket I have found myself fruitlessly grepping for the definition of some mysterious and troublesome function before eventually remembering to check whether there is a struct named foo with a member named bar somewhere. Normally this is not a problem so long as the file can make it far enough through the syntax phase that you can resolve the names. Yet somehow when the code is well and truly broken then somewhere, somehow, a struct will be involved without any way to resolve the member name. I say this having written my own symbol creating and binding macros before I realized the pitfalls and now dread having to go back and fix it. It seems like such a cool idea in principle.
dreamcompiler 1342 days ago [-]
It's a cool idea for programming and a terrible idea for software engineering.
Spivak 1342 days ago [-]
I feel like there needs to be a refinement that allows for something like

    (define-blob foo)

to produce the symbol FOO but no other symbols.
avmich 1342 days ago [-]
> CL is an extremely powerful language, it doesn't mean you should be using it all the time in your day-to-day work.

That's the problem with code guides - they try to prevent certain abuses, but making a good judgement about when some rare decision is justified is hard. So the guide misses the mark by making an approximate limitation - often on the safe side.

There are benefits to writing in boring, safe subsets of languages. Still, writing code guidelines is hard.

kmill 1342 days ago [-]
Did you see the justifications? For example,

> Not only does [INTERN] cons, it either creates a permanent symbol that won't be collected or gives access to internal symbols. This creates opportunities for memory leaks, denial of service attacks, unauthorized access to internals, clashes with other symbols.

It even has some advice on using wrappers for INTERN if you really need it.

The document has provisions for exceptions to the rules. There's discussion about using EVAL despite the fact the rule says you must not use it.

Also, "should avoid" means that you need a good reason, addressed in a comment and code review. Many examples you're probably thinking about are easily containers of elements of like type anyway (allowing mild cases of sum types and such). Though things tend to be more robust with intentionally created data types, I find.

oalae5niMiel7qu 1342 days ago [-]
All that can be solved by carefully choosing which package you INTERN the symbols into.

If you want to avoid having the symbols stick around forever, you can create a temporary package with MAKE-PACKAGE and then use DELETE-PACKAGE when you don't need it anymore.

kmill 1341 days ago [-]
You're right that the rule as stated is incorrect (and I internally corrected it to "it can either create a permanent symbol that won't be collected or give access to internal symbols", so for a moment I didn't know what you were responding to). I think the spirit of the rule is still good, which is raising these sorts of questions at design time and code review: Do you need INTERN? If so, are you INTERNing into the right package? If you are using a temporary package, are you remembering to DELETE-PACKAGE? Do you really need INTERN in an Internet-exposed interface? Does your use of INTERN prevent exposure of sensitive symbol plists to attackers? Does your code rely on private symbols to keep keyword arguments from being used in the public interface, and will INTERN accidentally circumvent it? Is the lifecycle of the temporary package sufficient to prevent DoS? etc.
dguaraglia 1342 days ago [-]
To be fair, all of Google's language guidelines revolve around making code easy to maintain, which means easy to read. Most of the other language guides also include such arbitrary restrictions that seem to go against the spirit of the language.
oalae5niMiel7qu 1342 days ago [-]
Google open-source code is repetitive and hard to read.
AnimalMuppet 1342 days ago [-]
>> You should avoid using a list as anything besides a container of elements of like type.

> Good-bye, code-is-data.

Could you regard that as a list of AST nodes?

logicchains 1342 days ago [-]
>I could reduce this guide by a good 30% with "You should avoid using Lisp as anything other than Go or Java".

You should see their C++ style guide, it basically bans most of modern C++. Unsurprisingly enough, Google C++ libraries (like Tensorflow, or GoogleTest) are some of the ugliest open source C++ libraries out there.

Syzygies 1342 days ago [-]
> Symbol guidelines: You should use lower case

Like flipping through for the soft porn in a friend's "romance" novel, I must confess I searched straight for this guideline.

It astonishes me that Lisp systems still default to all caps. Of course one can quickly disable this, but why send the old geezer "GET OFF MY LAWN!" message? That's exactly what people do, get off Lisp's lawn. They don't even get to the part where the parentheses (completely unnecessary for representing a tree in 2020) are a hazing exercise / loyalty test.

I love Lisp, but its public relations is the poster child for "How can people who are so smart be so dumb?"

sahil-kang 1342 days ago [-]
Can you share more info about the parentheses being a hazing test? I’ve seen Dylan syntax [1], but is there something else that shows the parentheses to be unnecessary in 2020?

[1] https://en.wikipedia.org/wiki/Dylan_(programming_language)

praptak 1342 days ago [-]
I've had good experience with Clojure's cautious approach to parentheses.

It's not radical, in that it's still basically parentheses but their nesting is greatly reduced by combination of tricks:

Flattening, if reversible. E.g. if something's always a list of pairs, then it's represented as a flat list with pairing inferred.

Having shorthands in syntax, i.e. [a b c] is (vec '(a b c)).

WalterGR 1341 days ago [-]
In my admittedly limited experience with Clojure a few years ago, IIRC it seemed like it just replaced some parentheses with square brackets, but did not actually reduce the number of total brackets by much.
QuesnayJr 1342 days ago [-]
Your bigger point -- that Lisp is bad at public relations -- is probably true.

I do have to admit I like the look of interacting with the Lisp interpreter where my commands are in all lower case, and the responses are in all upper case. I may be part of the problem.

WalterGR 1342 days ago [-]
> ... the parentheses (completely unnecessary for representing a tree in 2020) are a hazing exercise / loyalty test. I love Lisp

Which Lisp do you use since you dislike parentheses so strongly? Dylan? Something else?

Syzygies 1342 days ago [-]
I use my own preprocessor for Chez Scheme. Comments begin flush left, code is indented. Indentation implies the parentheses that it can. "|" opens a group that self-closes at the next unbalanced ")" or the end of the line. "$" is a placeholder where there's no Lisp symbol, when one needs to imply several opening parentheses at once. Together this avoids Lisp's signature 17 car pileup at the end of every expression.
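
A rough idea of how indentation can imply parentheses, as a Python sketch. This is not the preprocessor described above, whose rules are not shown in the thread. It assumes one simple reading: every non-blank line is a form, more deeply indented lines nest inside the line above them, and "|" opens a group that closes at the end of the line. The "$" placeholder, flush-left comments, and the "next unbalanced )" rule are left out, and the function names are hypothetical.

```python
def expand_bars(text):
    """Turn 'a | b c' into 'a (b c)'; each '|' opens a group closed at end of line."""
    parts = [part.strip() for part in text.split("|")]
    return " (".join(parts) + ")" * (len(parts) - 1)


def indent_to_sexpr(source):
    """Wrap each line in parentheses, nesting deeper-indented lines inside it."""
    result = ""
    open_indents = []  # indentation of every form that is still open
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines
        indent = len(line) - len(line.lstrip(" "))
        # A line at the same or a shallower indent closes the forms opened before it.
        while open_indents and open_indents[-1] >= indent:
            result += ")"
            open_indents.pop()
        if result:
            result += " "
        result += "(" + expand_bars(line.strip())
        open_indents.append(indent)
    return result + ")" * len(open_indents)


example = """\
define (square x)
  * x x
display | square 3
"""
print(indent_to_sexpr(example))
# prints: (define (square x) (* x x)) (display (square 3))
```

One known weakness of this simple reading is that a line holding a single atom becomes a call, e.g. `x` turns into `(x)`, which is presumably one reason a real design needs extra markers.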

I look for the "$" or equivalent in any proposal out there, to see if the author has written lots of code or is just talking. It's like looking for bone marrow in beef stew, evaluating a cookbook. Marrow is central to the story of Lisp; we got our start being able to wield tools to crack open bones after lions and jackals had left a kill. The added nutrition allowed our brains to increase in size. Soon we mastered fire, then Lisp.

I need a Trojan horse, to release this. I helped computerize algebraic geometry. The old guard doesn't change its ways; new people show up with open minds. I want to write a language whose core ability is concise monadic parsing of trees, with macros its core strength rather than a bolt-on. Then I'll "happen" to use it in a definitive successor to Okasaki's book "Purely Functional Data Structures". Oh yeah, and use this alternate syntax. I see no other way to adoption.

ohazi 1342 days ago [-]
> Lisp's signature 17 car pileup at the end of every expression.

This is poetry

kazinator 1341 days ago [-]
It really is a car pile up. Growing in the cdr direction doesn't add to the pile up; it's the car-induced nesting.
contravariant 1342 days ago [-]
I feel like antagonizing you purely to see this happen.
oalae5niMiel7qu 1342 days ago [-]
> I use my own preprocessor for Chez Scheme.

LOL no read macros.

kmill 1341 days ago [-]
I'd like to see a sample of this if it would be possible. Through the years I've been tinkering with new odd Lisp dialects based on ideas I've picked up along the way, and this might be another thing to test out.

Haskell-like languages can look vaguely Lisp-like when fully parenthesized due to juxtaposition meaning function application, like in lambda calculus. They also tend to have "$" to open a group that self-closes at the end of the expression (implemented as a very low precedence right-associative operator), which can greatly clean things up. It sounds like your "|". Haskell itself has a little-used "literate comments" mode where everything not prefixed with > is a comment: https://www.haskell.org/onlinereport/literate.html

Mathematica leaves something to be desired when it comes to being a tree rewrite language. I do find it to be very useful for some of the knot theoretic calculations I'm interested in (for example, the language makes it fairly straightforward to work with the braided monoidal category of tangles up to the HOMFLY relations! and display morphisms as linear combinations of pictorial representations of basic tangles!) But... I'd like to be able to more easily compose different tree languages together without worrying that my rewrite rules will conflict with the standard library in unexpected ways.

Recently I've been learning Lean, which has been pleasantly ergonomic (yet still frustrating -- proof assistants are still in their adolescence). It's been making me think that languages of the future really need a built-in system that lets you prove the correctness of your programs. This can help with basic things like subtypes, array bounds checking, and so on, that compilers can struggle with because they get little help from the user. Also, all the infix notation actually seems to improve legibility and ergonomics, though this would not be the case if Lean did not have an LSP server that lets you query for the definition of pretty much anything, including bespoke notation. It would be interesting to have a Lisp built on maybe dependent type theory where you can make sure your programs satisfy any property you might want, adding types and proofs to your program in an incremental way.

Another interesting thing about Lean is how it embraces a certain kind of metaprogramming, but in the form of a semi-inscrutable write-only language of "tactics". The UI presents the objects that you have and the objects that you want to have, and the various tactics help you perform steps to close the gap. This is more useful for proofs than programs (yes, proofs are programs, too, but they have some minor differences due to the propext axiom), but it makes me wonder if there is something in here that could be useful for writing programs in general.

(By the way, Macaulay 2 has been helpful to have around every once in a while when I need to do some basic calculations with modules. I was impressed how the language seemed to be designed to appear mathematician-friendly.)

oconnore 1342 days ago [-]
Julia has a full power macro system with “normal” syntax.
Syzygies 1342 days ago [-]
As does Nim.

There's a thrill to monadic parsing of text in Haskell. That thrill hasn't been extended in any language to algebraic data types. To me, "full" in any full power macro system I've seen means they've adopted every lesson that Lisp learned. I don't see anyone going on...

andi999 1342 days ago [-]
What are these lessons?
_bxg1 1342 days ago [-]
> Macros bring syntactic abstraction, which is a wonderful thing. It helps make your code clearer, by describing your intent without getting bogged in implementation details (indeed abstracting those details away). It helps make your code more concise and more readable, by eliminating both redundancy and irrelevant details. But it comes at a cost to the reader, which is learning a new syntactic concept for each macro. And so it should not be abused.

I really think this just applies to any kind of indirection - classes, functions, even named constants (vs literals).

skybrian 1342 days ago [-]
Syntax changes have different consequences for reading comprehension. You can often skip over a function call, making only a reasonable guess at what the function does and relying on invariants for all function calls. A macro can arbitrarily change the lexical environment of anything contained in it with few constraints, so reading other parts of a file without knowing what each macro does is more precarious.

And when it comes to navigating large amounts of code, you do need to stop somewhere; you can't do a depth-first read of everything before doing anything.

dreamcompiler 1342 days ago [-]
> I really think this just applies to any kind of indirection - classes, functions, even named constants (vs literals).

Disagree. Macros are fundamentally different in that they can change the syntax of the language. As such, they can inhibit the code's readability in ways the other defining forms cannot. Looking up something by name in CL is as simple as meta-dot. But you cannot meta-dot syntax. (Well, you could on a Lisp Machine, but not so much on current CL implementations.)

_bxg1 1342 days ago [-]
It's a difference of degree, not of kind. Any indirection forces the user to reason about a constructed concept instead of the literal facts of what's happening. By introducing one, you're making the assertion that the abstract concept is easier to reason about (including any effort required to learn it) than the contents being abstracted away.
TeMPOraL 1342 days ago [-]
> But you cannot meta-dot syntax.

You can C-c M-e the macro form to have SLIME expand it for you inline (read-only), and then navigate around it, using e to expand and c to collapse back. Super useful.

fiddlerwoaroof 1342 days ago [-]
> you cannot meta-dot syntax

I don't understand this? Besides the macrostep expander, meta-dot works on the macro name: reader macros _are_ harder to debug, but they're generally discouraged.

dreamcompiler 1342 days ago [-]
It requires more cognitive load to interpret new syntactic constructs. See brundolf's comment herein.
aidenn0 1342 days ago [-]
Right; all abstractions are only as good as they don't leak, but the question is "how easy is it to debug when it does leak?"

I think you listed classes, functions, and named constants in approximately the order of debuggability too.

It can be unclear even which macroexpansions are in play from a backtrace, much less which one caused the breakage. (Non-inlined) functions are right there in the backtrace, and of course, debugging a named constant is as simple as typing its name into the REPL.

fiddlerwoaroof 1342 days ago [-]
In Emacs, the macrostep expander solves most of the macro debugging issues: you usually can expand a macro use, and see exactly what code is being generated or “refactor” the macro away by copying the expansion at a certain level of detail and replacing the original form with it.
aidenn0 1342 days ago [-]
Yes, this is a useful tool; I still maintain that debugging macros with this tool is harder than debugging functions with the various other SLIME tools.
lioeters 1342 days ago [-]
Indeed, for what are classes and functions but specific kinds of macro (roughly speaking); or, macros and classes as special kinds of functions..

I'd include overloading operators in that list. It can be convenient, but comes at a cost to newcomers to the codebase.

I suppose any kind of shortcut or abbreviation carries this risk, to increase the cognitive load of the reader - things they have to remember and mentally substitute the shortcuts until they become second nature.

(Oh, right, what we call "shortcut" and "indirection" are both examples of abstraction, its value and cost.)

erik_seaberg 1342 days ago [-]
If you don't define the macro, you will have to replace it with boilerplate everywhere you would have used it, and constantly rereading all that is more ongoing grunt work than learning what the macro does.
dreamcompiler 1342 days ago [-]
> You should favor iteration over recursion

I take slight issue with this. It's one of the ways Scheme programming style differs from Common Lisp style in general, and it makes sense. But I find the LOOP macro obtuse and ugly and many algorithms are simply clearer with recursion, so if I can write it tail-recursively and I know the implementation supports tail calls (most do), I often do that instead of using LOOP.

The other handy case for recursion is "try foo, and if it doesn't work tweak this argument and try foo one more time." There tail calls don't matter because max iteration count is 2.

LOOP is still critically important but sometimes it impedes rather than assists clarity.

kazinator 1342 days ago [-]
I made a macro tlet which lets you write pseudo-functions that look like labels syntax. Everything is done by tagbody/go under the hood.

http://www.kylheku.com/cgit/lisp-snippets/tree/tail-recursio...

With that, you don't even have to bother with ensuring that the calls are in tail position.

remexre 1342 days ago [-]
Have you tried iterate (wrt the loop sucks problem)?
dreamcompiler 1342 days ago [-]
Yes but I've never used iterate enough to get comfortable with it. Maybe I should.
smabie 1342 days ago [-]
You probably shouldn't be using recursion either. Maps, folds, and scans are the bread and butter of FP programming. Tail recursion is almost as bad as a normal loop, and shouldn't be necessary most of the time. The most reasonable use of tail recursion is when the combination of the above operations doesn't have good enough time complexity, necessitating that you do everything at once.

Though, when I programmed CL in the past I used the iterate macro, which I found a lot nicer than the ugly loop one.

dreamcompiler 1342 days ago [-]
CL has maps and folds (the latter it calls REDUCE) and I definitely prefer them to any form of iteration when they make sense.
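
For readers who have not met the terms, here is the same small computation written as explicit recursion, as a fold, and as a scan. It is a Python sketch for illustration; in Common Lisp the fold would be REDUCE, as noted above. The function names are hypothetical.

```python
from functools import reduce
from itertools import accumulate
import operator


def total_recursive(xs, acc=0):
    """Sum written in accumulator-passing (tail-recursive) style.

    Note: Python does not eliminate tail calls, so this is for comparison only.
    """
    if not xs:
        return acc
    return total_recursive(xs[1:], acc + xs[0])


def total_fold(xs):
    """The same sum as a fold: combine the elements left to right with +."""
    return reduce(operator.add, xs, 0)


def running_totals(xs):
    """A scan: like a fold, but it keeps every intermediate result."""
    return list(accumulate(xs, operator.add))


numbers = [1, 2, 3, 4]
squares = list(map(lambda x: x * x, numbers))  # a map: transform each element
```

The fold and the scan state what is being combined and leave the traversal to the library, which is the readability argument being made in this subthread.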
dang 1342 days ago [-]
airstrike 1342 days ago [-]
Beat me to it! Link to the article in the 2012 submission is broken, so here's a web archive copy: http://web.archive.org/web/20130114221734/https://google-sty...

Seems like the exact same document which harks back to the ITA Software acquisition per comments at the time of that submission

ch_123 1342 days ago [-]
I assume (perhaps naively) that Google must have a non-trivial amount of CL development if they have a style guide for the language... anyone know what they use CL for?
carry_bit 1342 days ago [-]
My guess: Google bought ITA Software about 10 years ago, and their search engine was written in Common Lisp.
danielam 1342 days ago [-]
Yes. The authors of this document were all engineers who worked on QPX/QRES.
shaftway 1342 days ago [-]
It doesn't take much need to write a style guide. And this one is rather lackluster. By comparison look at the shell style guide: https://google.github.io/styleguide/shellguide.html
gpanders 1342 days ago [-]
From the linked shell style guide:

> Indentation is two spaces. Whatever you do, don’t use tabs.

The recent obsession with 2 space indents is boggling to me. I find it much more difficult to read (especially in long blocks of code with lots of indentation switches) and I'm not even visually impaired.

They also apparently really don't like tabs, which I find interesting. Personally, I was converted to the tabs camp after learning how much better tabs are for accessibility. I'm surprised that Google of all places doesn't take that more seriously.

ahefner 1342 days ago [-]
Is 2 spaces really a recent obsession? This is just what emacs (and ancient emacs-likes) have always done, and how all the Lisp code (excluding that by newbies and in fringe dialects) has been formatted. CL best practice, formatting wise, is 95% just to let emacs do what it does by default.
sgeisenh 1342 days ago [-]
Obligatory disclaimer: comments are exclusively my own and do not reflect the views of my employer.

A lot of these rules exist for the purpose of having a consistent rule. With modern tooling, there is virtually no difference between tabs and spaces. What is important is to have some rule to avoid the constant bikeshedding that would occur if there wasn't one.

alaaalawi 1342 days ago [-]
jimbokun 1342 days ago [-]
Is it weird that I enjoy reading this, even though I haven't programmed in Common Lisp for a long while?
twblalock 1342 days ago [-]
Who here can talk about how widely used Lisp is at Google, and what it is used for?
butterisgood 1342 days ago [-]
Went looking for advice on concurrency/parallelism and error handling.
danielam 1342 days ago [-]
FWIW, speaking from memory, if QPX, Google's (previously ITA's) low airfare search engine, is any indication, then the lack of any real mention is not surprising because QPX did not make use of parallelization or concurrency. The way it was written made that move very difficult. From what I recall, there was interest in parallelizing some bits of the computation, but I don't recall it ever really going anywhere. (QRES, ITA's reservation system, which was also written in Common Lisp, may have made use of concurrency or parallelization in some way, but my knowledge of that system is limited.)

N.b., QPX did not quite follow all of these recommended practices during my tenure (e.g., ubiquitous SETFing of object slots).

aidenn0 1342 days ago [-]
The 2 word guide to concurrency and parallelism: Use fork().
rurban 1342 days ago [-]
nconc over append, lol. You cannot use append when the first list is long, it's way too slow.

The rest all makes sense.
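
The trade-off behind that rule is copying versus destructive modification: APPEND copies its leading arguments, NCONC splices the existing lists together in place. A loose Python analogy is sketched below. It is only an analogy, since Python lists are arrays rather than cons cells, and the function names are hypothetical.

```python
def concat_copying(first, second):
    """APPEND-like in spirit: builds a new list and leaves `first` untouched."""
    return first + second


def concat_in_place(first, second):
    """NCONC-like in spirit: modifies `first` itself, so no copy is made."""
    first.extend(second)
    return first


original = [1, 2]
alias = original                      # a second reference to the same list

copied = concat_copying(original, [3])
# `original` and `alias` still hold [1, 2] at this point.

merged = concat_in_place(original, [4])
# `alias` now sees the change too, because the list was modified in place.
```

The in-place version avoids the copy, at the price that every other holder of the first list sees it change, which is why style guides tend to treat the destructive form with caution.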

lordgrenville 1342 days ago [-]
Slightly off-topic but is this official, and if so...why is Google hosting stuff on Github Pages? Seems sort of amateurish. Not to mention it belongs to a rival of theirs.
cbarrick 1342 days ago [-]
Google moved a lot of their open source projects to GitHub after they shut down Google Code. That was before the MS acquisition.

You can find a lot of smaller projects at https://github.com/google, and obviously there are some big ones like https://github.com/tensorflow/tensorflow.

Notable exceptions are Android and Fuchsia, which have their own hosted git repos at https://{android,fuchsia}.googlesource.com.

lordgrenville 1342 days ago [-]
Thanks for clarifying. It makes sense to host repos on GitHub - it's so widespread - but what struck me as weird was using the .github.io domain, which I usually associate with amateur bloggers who don't want the hassle of registering a domain.
Jtsummers 1342 days ago [-]
GitHub hasn't always been owned by MS, I believe this was up on GitHub before the acquisition (along with a lot of other Google content after they closed up Google Code).

And yes, it's official. Google acquired ITA which was rather famous as a Common Lisp shop. This meant that they had acquired a substantial Common Lisp codebase. If you drop the document from the URL you end up at [0] which includes links to other language style guides.

[0] https://google.github.io/styleguide/

jgodbout 1342 days ago [-]
Most Google open sourced code is on GitHub. Generally the code made by developers isn't "official"; it's just code (documents) made by people at Google.
slenk 1342 days ago [-]
Google doesn't have a product like that...