Design documents

This category contains design documents written by the Lix team, which may or may not be implemented.

regexp engine investigation

nix uses libstdc++'s std::regex. it uses whatever version of libstdc++ the host system has.

it invokes the engine in both std::regex_replace and std::regex_match modes.
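as a rough illustration of those two modes, here is a minimal sketch. the function names and patterns are made up for this example and are not taken from the nix source.

```cpp
#include <regex>
#include <string>

// Whole-string matching: std::regex_match only succeeds when the
// entire input matches the pattern. (Hypothetical helper.)
bool isLowerHex(const std::string & s)
{
    static const std::regex hexRegex("[0-9a-f]+");
    return std::regex_match(s, hexRegex);
}

// Substitution: std::regex_replace rewrites every non-overlapping
// match in the input. (Hypothetical helper.)
std::string collapseSlashes(const std::string & s)
{
    static const std::regex slashes("/+");
    return std::regex_replace(s, slashes, "/");
}
```

any replacement engine would need to cover both whole-string matching and substitution, since both styles are in use.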

nix occasionally uses the flags std::regex::extended and std::regex::icase which determine the available features - it's always either no flags, or both of these together. there's also a couple things that use the flag std::regex::ECMAScript. when the constructor is called without a flags parameter, the flags default to std::regex::ECMAScript (see method signature in C++23 32.7.2), so really we have only two cases.
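the two cases can be sketched like this. the helper names are hypothetical; the only thing taken from the standard is that the constructor's flags parameter defaults to std::regex::ECMAScript.

```cpp
#include <regex>
#include <string>

// True if the regex was compiled with the ECMAScript grammar flag set.
bool usesEcmaScript(const std::regex & re)
{
    return (re.flags() & std::regex::ECMAScript) == std::regex::ECMAScript;
}

// Case 1: no flags argument, so the grammar defaults to ECMAScript.
std::regex compileDefault(const std::string & pattern)
{
    return std::regex(pattern);
}

// Case 2: POSIX extended grammar plus case-insensitive matching.
std::regex compileExtendedIcase(const std::string & pattern)
{
    return std::regex(pattern, std::regex::extended | std::regex::icase);
}
```

with these helpers, `compileDefault("abc")` does not match "ABC" while `compileExtendedIcase("abc")` does.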

std::cregex_iterator and std::sregex_iterator are used.
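for reference, a small sketch of what iterating over all matches looks like with each iterator type. the function names are invented for this example.

```cpp
#include <cstring>
#include <regex>
#include <string>
#include <vector>

// Collect every match in a std::string using std::sregex_iterator.
std::vector<std::string> allMatchesS(const std::string & input, const std::regex & re)
{
    std::vector<std::string> out;
    for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it)
        out.push_back(it->str());
    return out;
}

// Same idea for a NUL-terminated C string using std::cregex_iterator.
std::vector<std::string> allMatchesC(const char * input, const std::regex & re)
{
    std::vector<std::string> out;
    const char * last = input + std::strlen(input);
    for (std::cregex_iterator it(input, last, re), end; it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

a replacement engine would need some equivalent of this "find all non-overlapping matches" iteration.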

there's a header regex-combinators.hh which defines regex::group and regex::list.... and a couple others that are unused. but those are just trivial textual things, not extensions, so we can ignore the file.
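to show what "trivial textual things" means here, this is a guess at the general shape of such combinators. it is a sketch written for this page under the assumption that they just concatenate strings; the real header may differ in names, signatures and output.

```cpp
#include <string>
#include <string_view>

namespace sketch {

// Wrap a pattern in a capture group: "a" becomes "(a)".
inline std::string group(std::string_view a)
{
    return "(" + std::string(a) + ")";
}

// Comma-separated list of one or more items: "a" becomes "a(,a)*".
inline std::string list(std::string_view a)
{
    return std::string(a) + group("," + std::string(a)) + "*";
}

}
```

since the result is an ordinary pattern string handed to the regex constructor, helpers like these put no extra requirements on the engine.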

getting the C++ standard

someday when C++23 is official you will be able to pirate the PDF. otherwise, you can clone https://github.com/cplusplus/draft and check out the tag n4950 which is the current formally adopted working draft as of 2024-03-14 and is intended to have the same technical content as the final standard. you can then invoke make in the source subdirectory which will produce std.pdf. you will need LaTeX installed. if you're ever not sure which working draft is the one that became a particular version of the standard, Wikipedia will probably tell you...

(personally I install texlive.combined.scheme-full from nixpkgs on all my machines that have room for it, but this is surely more than necessary, it just makes me feel warm and fuzzy -- Irenes)

chapter 32 is the one that documents regular expressions.

open questions that require reading the standard

required functionality

the extended flag, per the C++ standard, "Specifies that the grammar recognized by the regular expression engine shall be that used by extended regular expressions in POSIX.". it references POSIX, Base Definitions and Headers, Section 9.4.

the ECMAScript flag "Specifies that the grammar recognized by the regular expression engine shall be that used by ECMAScript in ECMA-262, as modified in [section 32.12 of the C++ standard]." it references ECMA-262 15.10. the changes in 32.12 are important and probably do create real compatibility issues for us, though fortunately it's only a single page.

if we complete this chart we can use it to assess which existing engines would meet our needs, or how much of a pain in the ass it would be to make a new one

the columns are the two ways it gets invoked

                      | extended + icase | ECMAScript
Syntactic constructs  | --               | --
(TODO: fill in every construct here)
Semantics             | --               | --
Case-insensitivity    | yes              | ?
(TODO: fill in other behaviors here)
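one way to help fill in the chart empirically is a small probe harness that checks whether a pattern is accepted, and whether it matches a given input, under each flag combination. this is only a sketch with made-up helper names, and it reports what the local standard library does, which is not necessarily what the standard requires.

```cpp
#include <regex>
#include <string>

// True if `pattern` compiles under `flags` without throwing.
bool compiles(const std::string & pattern, std::regex::flag_type flags)
{
    try {
        std::regex re(pattern, flags);
        return true;
    } catch (const std::regex_error &) {
        return false;
    }
}

// True if `pattern` compiles under `flags` and matches all of `input`.
bool matches(const std::string & pattern, std::regex::flag_type flags,
             const std::string & input)
{
    try {
        return std::regex_match(input, std::regex(pattern, flags));
    } catch (const std::regex_error &) {
        return false;
    }
}
```

for example, `matches("abc", std::regex::extended | std::regex::icase, "ABC")` confirms the "yes" in the case-insensitivity row for the first column.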

Dreams

This page documents the dreams of the Lix team. These are features which we have generally not roadmapped yet, and which may not have complete and thoroughly thought-through plans, and which we would like to think about more completely before implementing. We are writing them down publicly so that others can dream with us.

fixing ux

slaying the hydra

these are problems that make hydra sad

Language versioning

This document is extremely a draft. It needs some editing and discussion before it can be made into a useful thing. It's been simply copy pasted out of the pad in its current form.

See also

musings

puck: honestly, having language version as part of a scopedImport-style primop would be funny

horrors: we're shitposting about setting language version from the source accessor

horrific writeup

basic mechanism

add a new syntactic element that is only valid at the head of a file and used only to declare language requirements. nix versions that cannot satisfy all requirements must reject this element, to avoid situations in which two nix versions parse the same file differently, or even evaluate the same file to different derivation hashes. any kind of comment as used by eg GHC is not viable for nix for this reason.

proposed syntax for the first implementation: use $( $feature: ident )+;

anything ahead of this directive could be either unversioned nix code or versioned nix code (see below for details), but since the directive is only valid at the head of a file or expression this "code" can only be comments. this kind of locks us into supporting the current comment syntax forever, but the comment syntax is rather fine so this won't be a problem.
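a minimal sketch of recognizing the proposed directive at the head of a file, skipping leading comments. everything here is an assumption made for illustration: the function names, treating `# ...` and `/* ... */` as the comment forms, and the identifier character set. it is not how any nix parser actually does it.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Skip whitespace, `# ...` line comments and `/* ... */` block comments.
static std::size_t skipTrivia(std::string_view src, std::size_t i)
{
    while (i < src.size()) {
        if (std::isspace(static_cast<unsigned char>(src[i]))) {
            ++i;
        } else if (src[i] == '#') {
            while (i < src.size() && src[i] != '\n') ++i;
        } else if (src.compare(i, 2, "/*") == 0) {
            std::size_t close = src.find("*/", i + 2);
            if (close == std::string_view::npos) return src.size();
            i = close + 2;
        } else {
            break;
        }
    }
    return i;
}

// Assumed identifier shape: [A-Za-z_][A-Za-z0-9_'-]*
static bool isIdentStart(char c)
{
    return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}

static bool isIdentChar(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '\'' || c == '-';
}

// Parse `use feat1 feat2 ... ;` at the head of `src`. Returns the feature
// names, or nullopt if there is no directive or it is malformed (a real
// parser would report these two cases separately).
std::optional<std::vector<std::string>> parseUseDirective(std::string_view src)
{
    std::size_t i = skipTrivia(src, 0);
    if (src.compare(i, 3, "use") != 0) return std::nullopt;
    i += 3;
    // `useful` is an ordinary identifier, not the directive.
    if (i < src.size() && isIdentChar(src[i])) return std::nullopt;

    std::vector<std::string> features;
    for (;;) {
        i = skipTrivia(src, i);
        if (i < src.size() && src[i] == ';') break;
        if (i >= src.size() || !isIdentStart(src[i])) return std::nullopt;
        std::size_t start = i;
        while (i < src.size() && isIdentChar(src[i])) ++i;
        features.emplace_back(src.substr(start, i - start));
    }
    if (features.empty()) return std::nullopt; // at least one feature is required
    return features;
}
```

note that the comment-skipping step is exactly the part that would pin down the current comment syntax, as described above.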

each feature may declare a syntactical requirement for the file, a semantic requirement, or possibly both (cf rust editions, or perl use v<something>).

features may be global, namespaced to their implementations, or live in a reserved experimental namespace an implementation can add to and remove from as it wishes with absolutely no guarantee of future evaluability.

syntactic features

syntax is entirely local to the file itself and has few to no intercompatibility constraints with other code. a very useful syntax requirement would be something like no-url-literals, which might strip the syntactic ability to parse url-like sequences of characters into strings and, rather than simply throwing a parse error as nix currently does with the experimental feature of the same name, parse them as eg a lambda with a sequence of divisions in its body.

(realistically no-url-literals would not appear in practice, instead it should be implied by use itself since url literals are such an obvious misfeature)

semantic features

semantic features produce evaluation changes that could not be achieved any other way. examples of this are:

semantic changes may escape the expression that requires them, and usually some amount of cross-compatibility with other semantic versions must be given. using the same examples as above, considerations can include:

this is in fact a full classification of cross-compatibility issues: side-effects changing, evaluation outputs changing, and evaluation inputs changing. side-effects need not be considered very much since nixlang is supposed to be pure and all side-effects that are not part of the store interface must already be considered incidental. evaluation outputs changing can be handled by optional lint or runtime warnings when a versioned evaluation structure passes a semantic version boundary without being annotated as an intentional behavioral leak. evaluation inputs changing is a non-issue because nix plugins and the ExternalValue infrastructure already make it impossible to rely on the type system being fully specified at the time an expression is written

inter-file inter-actions

by default language features must not be propagated across an unadorned import boundary to retain compatibility with existing nix code (eg nixpkgs, which will not be able to switch for quite some time). in some circumstances it is however required to propagate language features across imports to provide a consistent and meaningful interface, eg in the case of a hypothetical requiredLanguageFeatures attribute for a flake. to allow for both of these requirements to peacefully coexist we add a new primop:

scopedImportUsing
:: { features ? <current language features> :: AttrSetOf bool
     ## ^ language features as would be specified by `use ...;`.
     ## selecting a default-off feature is achieved by setting its key to `true`,
     ## deselecting a default-on feature is achieved by setting its key to `false`.
     ## nesting is not needed because features are identifiers. future changes to
     ## the use interface may extend the type of this set.
   , newGlobals ? env: env :: AttrSetOf Any -> AttrSetOf Any
     ## ^ function to produce the new global environment. it receives the default globals
     ## set for the target expression language features (as calculated from `features` and
     ## the target `use` clause) and produces a new set.
     ## `scopedImport` behavior is recovered by setting this to `const newEnv`.
   }
-> PathLike
## ^ imported path as in `scopedImport`
-> Any
## ^ import result. may be cached, most immediately using the intransparent internal
## object id of the provided features and the globals set. this mimics the beavior
## or `import` in cppnix

if the imported expression selects a different set of language features the features specified by scopedImportUsing are ignored.
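one possible reading of that rule, sketched as a feature-set computation. the types and names are hypothetical, and the interpretation (a file with its own `use` clause ignores the importer's `features` entirely, and explicit selections are overlaid on the defaults) is an assumption rather than something the text above pins down.

```cpp
#include <map>
#include <optional>
#include <string>

// Feature name -> enabled, mirroring the `AttrSetOf bool` in the signature.
using FeatureSet = std::map<std::string, bool>;

// Overlay explicit selections on top of a base set of features.
FeatureSet overlay(FeatureSet base, const FeatureSet & selected)
{
    for (const auto & [name, enabled] : selected)
        base[name] = enabled;
    return base;
}

// Features in effect for an imported file. If the file declares its own
// features, the ones requested through scopedImportUsing are ignored.
FeatureSet effectiveFeatures(
    const FeatureSet & defaults,
    const FeatureSet & requestedByImporter,
    const std::optional<FeatureSet> & declaredByFile)
{
    return overlay(defaults, declaredByFile ? *declaredByFile : requestedByImporter);
}
```

under this reading, an importer asking for `{ a = true; }` has no effect on a file that itself declares any features, which keeps each file's meaning independent of who imports it.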

scopedImportUsing is available in the builtins set and crucially, can be replaced. this allows a hypothetical flake implementation to replace both scopedImportUsing and import with its own versions that provide propagation behaviors that might be expected from such a library:

additionally the current language features might be made available through a builtin value languageFeatures by such a replacement of scopedImportUsing.

builtins versioning, global versions

a language feature may add or remove elements of builtins or the global environment. as mentioned earlier this does not pose a large hazard since evaluation is sufficiently unspecified that this must already be expected to happen.

interactions with eg nixpkgs lib

nixpkgs lib (and other libraries) will have to cater to the smallest common denominator when exposing library functions/constants as they do now. if we change a function to have a different prototype and a library reexports it from builtins to its own namespace the language features used by the code importing the library do not matter. to make this problem less unbearable we may want to introduce a concept of library objects and a "use library" directive like eg python from ... import ... that can pass language features down to the library being imported in some way.

as a first approximation it would be sufficient to encourage libraries to version their namespaces in such a way that accessing a namespace that relies on language features not present in the current evaluator will fail to evaluate (eg by providing the library itself as a plain set and each version as an attribute that (lazily) imports the specific version of the library needed to fulfill the requested version).

bad ideas for features to remove/change in the first langver

Feature detection

jade: I think we might want to be able to feature detect certain features, e.g. new builtin args, which can be done without, but we would like to know if they are there.

builtins.nixVersion has been defanged, which means that an alternate cross impl compatible mechanism needs to be created.

Minimally thought-through proposal

builtins.features is an attribute set, where individual attribute names are exposed with the value true if they are implemented by a given implementation.

Attribute names are of the format:

"domainname.feature", for example, "systems.lix.somefeature".

Docs rewrite plans

Here, for now (public edit link): https://pad.lix.systems/lix-docs-planning