* Zipper rework
* Formalize trait structure
* Implement value-level equivalent to traverse's structural walk
* i.e. A traversal where each step yields a String, either from an L or a Var
* Implement operations equivalent to existing compiler functionality
* Integrate with Expression and replace existing zipper impl
* Should be able to rework certain existing traversals as recursive descents
backed by utility methods on Expression in order to reduce abstraction footprint
* Implement utility methods for handling subscopes
* Labeled products, and corresponding project / unlabel terms
to lift the inner program to the root after interpretation
* Implement auto-loaded core
* Synthesize a mixin statement in its own layer
prior to import resolution?
* May be better to do manually until effect type interpretation is working
* Program structure rework
* Data and metadata should be monolithic
* Allows early data injection
* ex. So intrinsics, core, etc can provide their own pre-checked data
and reduce typechecking work
* Program is effectively a state monad over data and zipper
* Should probably be elided in favour of site-local zipper usage,
with the newly-monolithic program data as a function parameter
* Interpret effect type data within a restricted context
* Use inline label to identify at parse time, and hoist up out of root / module scope
for effect types that require sandboxing away from runtime code
* Allows use of expressions, utility methods for constructing variants, etc
* Each effect type should be able to supply its own context of utility methods
* i.e. The operator function
* Delimiter refactor
* Once effect types are interpreted at compile time,
should be feasible to finish implementing delimiter definition
* Allow omitting the 'operator' field, so (...) can
apply directly instead of indirecting through the app function
* Factor Empty out of type and row ASTs
* As-is, () is simply acting as another term
* Equivalent to Int, Float, etc, in that it's just an identifier
for some opaque unary kind
* Adds extra complexity to typechecking
* At value level, could be replaced with a term
* At type level, can be replaced with the corresponding unary kind
* At predicate level, can simply be replaced with an empty ClosedRow,
as predicates don't care about products and sums
* To wit, add () as an intrinsic type, remove the AST entries,
and fix up any resulting holes in the parse grammar and test suite
* Finalize interpreter
* Fix remaining test failures
* Will need to reimplement alpha renaming to guarantee
that product labels don't accidentally overwrite parent function scope
* Should replace the split between local and structural,
which breaks in certain circumstances
* Likely also applies to the typechecker?
* Factor out Result if possible; semantically, should not be able to fail post-typechecking
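The alpha-renaming idea above (fresh indexed names per binder, so inner bindings can never collide with parent scope) can be sketched as a recursive walk. This is a Python illustration over a toy tuple-shaped AST, not the project's actual Expression type:

```python
# Alpha-renaming sketch: every binder receives a unique indexed name,
# so nested bindings can't accidentally overwrite parent function scope.
# The ("var", ..) / ("abs", ..) / ("app", ..) shapes are illustrative
# stand-ins for the real AST.

def alpha_rename(expr, env=None, counter=None):
    env = env or {}
    counter = counter or {}
    kind = expr[0]
    if kind == "var":
        _, name = expr
        # Free variables are left untouched.
        return ("var", env.get(name, name))
    if kind == "abs":
        _, param, body = expr
        idx = counter.get(param, 0)
        counter[param] = idx + 1
        fresh = f"{param}#{idx}"
        inner = dict(env, **{param: fresh})
        return ("abs", fresh, alpha_rename(body, inner, counter))
    if kind == "app":
        _, f, x = expr
        return ("app", alpha_rename(f, env, counter),
                       alpha_rename(x, env, counter))
    raise ValueError(kind)

# (a -> (a -> a)): the shadowed binders receive distinct indexed names
shadowed = ("abs", "a", ("abs", "a", ("var", "a")))
renamed = alpha_rename(shadowed)
```

Attaching the index alongside the original name (rather than replacing it) also lines up with the later note about avoiding changes to the user-visible textual representation.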
* Reify unary type construction
* Essentially a means to mark a given label as being a free-standing
term with a built-in type of matching identifier
* i.e. Proper definition for the () intrinsic,
or representations of backend-specific machine types
* Implement as an effect type that populates the appropriate
backend data prior to typechecking
* Prune unused bindings
* Generalized alternative to modifying imports at the Import site
* Would have to track modifications across multiple imports
and produce a final composite, potentially with conflicts
or other elements of unsoundness
* Since the intent is to avoid importing unused code,
better to defer the process to an automated pass
* Reference counts can be taken during name resolution,
and used to drive a pruning pass that removes unused code
to reduce typechecking workload
* Will be more feasible once alpha renaming is implemented
* Implies that alpha renaming will have to occur before name resolution
* May be more tractable to attach indexed identifiers to
instances of the same name, to avoid changing the textual representation
and potentially confusing the user
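The reference-count-driven pruning pass could take roughly this shape; a Python sketch over a flat name-to-expression binding map, which simplifies away real module structure. A real pass would likely iterate to a fixpoint so transitively-unused bindings also get dropped:

```python
# Pruning sketch: count Var references across all bindings (as name
# resolution would), then drop bindings that are never referenced and
# are not entry points, reducing typechecking workload.

def count_refs(expr, counts):
    kind = expr[0]
    if kind == "var":
        counts[expr[1]] = counts.get(expr[1], 0) + 1
    elif kind == "app":
        count_refs(expr[1], counts)
        count_refs(expr[2], counts)
    elif kind == "abs":
        count_refs(expr[2], counts)

def prune(bindings, entry_points):
    counts = {}
    for expr in bindings.values():
        count_refs(expr, counts)
    # Single pass only; a production version would re-run until stable,
    # since pruned code may have held the sole reference to a binding.
    return {
        name: expr
        for name, expr in bindings.items()
        if name in entry_points or counts.get(name, 0) > 0
    }

bindings = {
    "used": ("abs", "x", ("var", "x")),
    "unused": ("abs", "y", ("var", "y")),
    "main": ("app", ("var", "used"), ("var", "used")),
}
pruned = prune(bindings, {"main"})
```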
* Mixin functionality
* Implement as an effect type:
* Mixin = /module/path = ()
* Behaves similarly to Import, but instead of replacing itself
with Expression::Var(/module/path), creates a series of aliases
using the qualified path:
* ex:
member_a = (<~ /module/path).member_a
member_b = (<~ /module/path).member_b
member_c = (<~ /module/path).member_c
...
* Can use the innermost value as an optional list of aliases to create
* When left empty, aliases everything in the module
* ex:
Mixin = /module/path = {
member_b = (),
member_c = (),
}
* Formalize unlabeled and unvalued rows
* Once empty type is reified, can implement as syntax sugar
* ex. (1 =) would resolve to (1 = ()),
(= 1) would resolve to (() = 1)
* How does this interact with a potential implementation of type variables?
* With type variables, = would become a binary operator,
and thus conflict with this manner of syntax-level sugar
* May instead be able to implement custom checking / unification
rules to allow a precurried instance of the = operator
to unify with the appropriate unlabeled / unvalued row
* Is this unnecessarily magical?
* If the two concepts can be considered distinct,
they should probably have distinct syntax
in the absence of a purely structural solution
* Could this be considered purely structural?
* Functions, and thus curried functions, are values
* But, they have a distinct semantic from final concrete values
* Could this be solved by parametrizing delimiter operators to their separator?
* i.e. Comma denotes regular rows
Something else auto-lifts to unlabeled rows
Something else auto-lifts to unvalued rows
* Reify implementation of core language primitives
* Core types must be representable entirely in-language
* Establishes a separation between pure logical types and their machine representations
* Makes the language self-sustaining for basic computations
* Can be detected by backends and replaced with the appropriate machine type
* Set
* Simple unlabeled valued rows
* ex:
{1, 2, 3} = {
Set = 1,
Set = 2,
Set = 3,
}
* Array
* Arbitrarily-labeled valued rows
* Effectively behaves like a double-ended queue thanks to directional projection
* ex:
[1, 1, 1, 2, 2, 3] = {
Array = 1
Array = 1
Array = 1
Array = 2
Array = 2
Array = 3
}
* Untagged enum
* Simple labeled unvalued rows
* Accumulator set(?)
* Scoped labeled unvalued rows
* Map
* Simple labeled valued rows
* a.k.a. Structs in traditional languages
* Multi-Map
* Scoped labeled valued rows
* Newtype
* Nested labels
* ex:
Inner(T) = (Inner = T)
Wrapper(U) = (Wrapper = U)
Wrapper(Inner(Value)) = (Wrapper = Inner = Value)
* Boolean
* Untagged enum
* ex:
[
False : ()
True : ()
]
* Numbers
* Fundamental Digit
* Since scoped rows implicitly encode a quantity,
use a peano-style approach with unlabeled unvalued scoped rows:
* ex:
0 = ()
1 = (
Digit = ()
)
2 = (
Digit = ()
Digit = ()
)
* Can represent any natural number
* Should act as the lowest-level number type,
but likely not be used in a first-class manner
due to O(n) scaling
* May be more succinctly representable with
type constructors and types-from-terms
* i.e.
(Digit :: Int -> *)
Digit 0 = ()
Digit N = ( Digit = (), (Digit N - 1) )
* However, what are 0 and N in this context?
* Chicken-and-egg problem; would need representable
numbers in order to have a type-level integer term
* Proven tractable via value level function by peano numbers,
so may prove similar at the type-level
* In hindsight, may also be representable using the 0-9 Glyph types,
but that would require intrinsic logic for algebraic operations,
which is less expressive and powerful than encoding them in-language
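The peano-style encoding above can be sanity-checked by modeling a scoped row as a plain multiset of labels; N copies of the unvalued `Digit` label encode the natural N, and addition falls out as row concatenation. A hedged Python sketch:

```python
# Peano-digit sketch: a natural number is a scoped row holding N copies
# of the unvalued "Digit" label. Rows are modeled as plain label lists,
# which glosses over real row-type machinery.

def nat_to_row(n):
    return [("Digit", ())] * n

def row_to_nat(row):
    return sum(1 for label, _ in row if label == "Digit")

# Addition is simply concatenation of the two scoped rows,
# which is where the O(n) scaling noted above comes from.
def add(a, b):
    return a + b

two, three = nat_to_row(2), nat_to_row(3)
five = add(two, three)
```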
* Base-N Digit
* Product of two fundamental digits, the first representing value,
and the second representing the numeric base at which the value should wrap
* ex:
BaseNDigit = (
Value = (
Digit = ()
Digit = ()
)
Base = (
Digit = ()
Digit = ()
Digit = ()
)
)
* Can represent only N numbers, thus bounding its O-scaling
* Base-N Number
* A sequence of base-n digits represented as an unlabeled scoped row
* More practical for first-class use than fundamental or base-n digits,
as arbitrary numbers can be represented,
and O-scaling becomes a function over base and arity
* ex:
123 = (
BaseNDigit = (
Digit = ()
)
BaseNDigit = (
Digit = ()
Digit = ()
)
BaseNDigit = (
Digit = ()
Digit = ()
Digit = ()
)
)
* Signed Number
* A product of boolean sign and numeric type
* ex:
-3 = {
Sign = False = ()
Number = (
BaseN = (
Digit = (),
Digit = (),
Digit = ()
)
)
}
* Should be applicable to any underlying numeric type
* Introduces the need for typeclass-like functionality
* Decimal number type
* A product of two base-n number types
* ex:
5.3 = (
Value = (
BaseN = (
Digit = ()
Digit = ()
Digit = ()
Digit = ()
Digit = ()
)
)
Frac = (
BaseN = (
Digit = ()
Digit = ()
Digit = ()
)
)
)
* Similar to signed, should be applicable to any underlying numeric type
* Characters
* Effectively just a function from glyphs to natural numbers
* Would be nice to establish a semantic separation between the two,
represent glyph sets (ASCII, UTF tables, etc) via sum of simple unvalued rows,
and use a function to convert them into the appropriate representation
* Due to touching many common variable names,
would be good practice to scope glyph names inside another label
to prevent collision during composition
* Moreover, scoping within a name establishes the semantic of being
a value-level character
* In effect, a Glyph represents a 'machine character'
* i.e. A member of some character set (ASCII, UTF, etc) on the compiler's host system
* Must ultimately be stored in memory somewhere,
and thus implicitly be the number assigned by the character set spec,
but abstracts this fact away behind a pure notion of "member of the set of glyphs";
any single atom that can make up the textual representation of the language
* Therefore, Char takes that lowered notion and lifts it back up
to the compiler-local notion of a machine character,
thus enriching the simple atom with language functionality
* ex:
Glyph = [.., Y = (), Z = (), ..]
'Z' = ( Char = Z = () )
make_char = a :: Glyph -> Char = a
char_to_num = [
..
Y :: (Char = Y = ()) => 89
..
]
* Strings
* Scoped product of char values
* Notably not an Array, since Char has its own label namespace,
but could be promoted to one by wrapping each sub-row in the appropriate label,
or demoted to a Glyph array by replacing the Char label with the Array label
* ex:
"Hello" = (
String = Char = H = ()
String = Char = e = ()
String = Char = l = ()
String = Char = l = ()
String = Char = o = ()
)
['B', 'y', 'e'] = (
Array = Char = B = (),
Array = Char = y = (),
Array = Char = e = (),
)
* Labels?
* Following from notions of Glyphs and Strings,
a label would be an array of glyphs
* Can't 'unwrap' another level without losing ordering,
so drilling down further by removing the preceding '() =' is infeasible
* ex:
Bye = [B, y, e] = (
Label = B = (),
Label = y = (),
Label = e = (),
)
* This gives rise to a concrete notion of labels as values,
and by extension, any representable value as a potential label
* In-language definition of literal parsing?
* Given the above formation of robust text-handling types,
and the fact that literal values begin their life as characters in parser input,
could the recognition and conversion process be handled in-language?
* Likely via effect type, i.e. an expression that is detected via special case
during parsing, and triggers an intermediary pass that typechecks
some code in isolation and interprets it against the literal to produce a value
* This raises the notion of values as first-class AST members
* Would likely subsume InternedString use sites like Var, Abs input, Label, Unlabel
* However, would need a comparable interning process to
avoid regressing and duplicating a lot of data
* Also touches on AoT rebinding needed for in-language typeclass implementation;
similar idea of using an effect type to run language code for some specific purpose
inside the compiler
* In that case, would take a function that takes an option-wrapped label value
(to account for the value in question not yet being bound)
and outputs some new label value, which is then bound above the root
* w.r.t. generalization:
* Labels are mandatory, as the process deals in bindings
* However, gives rise to constrained variations that elide the option
and / or label, thus promising to operate over an existing or specific value
* Worth considering products of functions as the fundamental type
* Would allow elegant overriding / disallowing of multiple impl
by using simple / scoped rows, possibly extending to
ordered overrides by keying on numbers
* Works without needing the ability to project sum handlers
in order to update with new impls, since said handlers
can be generated by a function over the product of handlers
* What row semantic should the root use?
* Can either be fixed, or controllable via effect type
* Fixed is justifiable, since you can bind
a product of arbitrary semantic within it
* However, controllable is still useful, and thus preferable
* i.e. To allow arbitrary module semantics,
similar potential for overriding via rebind,
or disallowing it programmatically
* What would such an effect type look like?
* Simplest version is an enum describing the desired semantic,
but generalizes to a function that returns
a value which is then used as the basis for build_ast
* If rebinding is akin to a lens or prism, this would be akin to
the constructor for the lensed data
* Formalize core library, prelude functionality
* If a function can be represented textually, it should live in a textual library
that gets automatically added to the root product, rather than as an inline intrinsic
* i.e. inj / proj construct the Project / Inject intrinsics,
so can't be represented textually, but wrappers over them can
* To wit, should the intrinsics be lower-level?
* i.e. proj_* = a -> <proj_*> a
* Probably wiser to use the simplest-possible representation
* Best practice, keeps options open
* Can still have nice sum API via textual core
* What other notable intrinsics could exist?
* Label
* Postfix operator, creates a closure that produces a labeled value
* Would require label variables
* Unlabel
* Prefix operator, creates a postfix closure that unlabels a labeled value
* Would require label variables
* Product / Sum
* Delimiter operators, lifts a row into the respective type
* Would require row variables
* What other notable core functions could exist?
* Update
* Restrict
* Concat
* Branch
* Refactor directional projection operators to use proj / inj primitives
* Feasible now the interpreter can cope with them
* AST builder API
* Will need a nice way to write expressions directly in rust
without resorting to a parsing macro
* Some sort of chained function call API that can construct nodes
with unpopulated members, and map inward in a way that lines up
with the language syntax
* App / combinators may be useful here
* While parsing, AST should be constructed node-by-node,
with type holes used to represent unparsed elements
* Currently works this way at the module level, but not for individual expressions
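The chained / combinator-style builder idea can be sketched like so; this is Python for brevity (the real API would be typed Rust method chaining), and the node names and `hole` placeholder are illustrative, not the project's actual AST:

```python
# Builder sketch: small combinator helpers so construction nesting
# mirrors the language syntax, with a hole standing in for
# not-yet-parsed elements.

def var(name):
    return ("var", name)

def abs_(param, body):
    return ("abs", param, body)

def app(f, x):
    return ("app", f, x)

def hole():
    return ("hole",)

# a -> b -> a b, first with the application left as a type hole,
# then fully populated.
partial = abs_("a", abs_("b", hole()))
expr = abs_("a", abs_("b", app(var("a"), var("b"))))
```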
* Proper type name handling and recursivity for type annotations
* Currently checking names to establish linkage to type variables,
but only on an annotation-by-annotation basis
* Needs to work in a scoped fashion, so names are reused
all the way down the tree
* Bindings in nested scopes should be able to use type variables
and predicates defined in parent annotations
* Type variables with bound names should use them during logging
* To wit, types need their own binding environment
* Existing TypeEnvironment is a mapping of value names to their types,
* This would be a mapping of type names to their type IDs
* Currently exists in implicit form through a combination of
global intrinsic ID -> name bindings, and annotation-local name bindings
* Need to reify into a proper environment struct comparable to TypeEnvironment
* Should also solve the ambiguity between bound / unbound and concrete / free
* i.e. Types can be named (bound) separately from being concrete or free
* Effectively the foundational machinery for type aliases
* Once emplaced, it should simply be a matter of adding
syntax to inject a named type into the environment
within a given scope - i.e. inside a product
* To wit, the existing type annotation syntax could
be repurposed; value bindings would be typed via inline :: syntax,
and type bindings would be created via : syntax
* Lines up with the Rust approach of 'type XYZ' to inject a type binding,
which can be used via 'fn(a: XYZ)', 'let x: XYZ = ...', etc.
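The type-name environment described above could mirror TypeEnvironment as a scoped stack of name-to-type-ID maps, so annotations in nested scopes can reuse variables bound by parent annotations. A minimal Python sketch with hypothetical names:

```python
# Scoped type-name environment sketch: a stack of name -> type-ID maps.
# Lookups walk from the innermost scope outward, so type names bound by
# parent annotations remain visible all the way down the tree.

class TypeNameEnv:
    def __init__(self):
        self.scopes = [{}]

    def push(self):
        self.scopes.append({})

    def pop(self):
        self.scopes.pop()

    def bind(self, name, type_id):
        self.scopes[-1][name] = type_id

    def lookup(self, name):
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None  # unbound: name is free in this scope

env = TypeNameEnv()
env.bind("XYZ", 0)      # type binding at the outer scope
env.push()
env.bind("T", 1)        # annotation-local type variable
inner = (env.lookup("XYZ"), env.lookup("T"))
env.pop()
outer = (env.lookup("XYZ"), env.lookup("T"))
```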
* Implement per-node annotation
* Use :: infix syntax
* Will need to change the typechecker to check for an annotation at every node,
but that should be feasible
* Should likely replace existing binding-annotation system,
since bidirectional inference is robust enough to handle granular typing
* Is it worth adding a flipped infix syntax for prefix application?
* ex. Int::2
* Possibly useful for expressing the type of a large node ahead of its value definition
* May not be necessary given label and abs input annotation sugar,
though arguably still applicable for large nested products,
especially when mixin behaviour is implemented
* Unsure if viable to parse if using the same operator, though it would be preferable
* Worst-case, could use :> and <: or similar to differentiate
* Given that predicates are now stored per-node, how feasible is it
to include them in this syntax rather than special-casing it at bindings?
* More robust scoped_product and scoped_sum tests
* Need to cover a range of projections / injections
* Middle element of a three-arity label subset
* Mixed subsets
* Implement label annotation sugar
* i.e. label::Int : 1 <=> label: 1 :: label: Int
* Implement abs input annotation sugar
* i.e. a::Int -> a <=> a -> a :: Int -> a
* Where the rightmost a would be substituted during inference
* Full a::Int -> a::Int annotation would result in (a -> a :: Int -> Int)
* Switch value-level label binding from = to :
* Consider label variants
* ex:
* Label / Value
* Label / Empty
* Empty / Value
* Empty / Empty
* Also worth considering separate equality semantic
* Label Only
* Value Only
* Label / Value
* Strictly speaking, Label / Empty and Empty / Value can only use the data they have
* i.e. Label / Value is the only one that can meaningfully opt between the three
* Empty / Empty is effectively an explicitly singleton product / sum,
which overlaps somewhat with the existing empty type polymorphism
* May be better to have Empty / Empty become first-class eventually,
since it's a structural representation rather than an arbitrary one,
and there's no real use case I can think of for polymorphism between products and sums
* Seems like some sort of dependent typing?
* {} and [] are both Empty / Empty by necessity
* The semantic of { .. } and [ .. ] is decided by the labels inside them
* Is it sane to mix-and-match different label variants within the same product / sum?
* If each label variant + equality semantic combo has a distinct representation,
they shouldn't interact, and thus should be safe to allow
* If viable, this is both simpler and more powerful than
forcing each product / sum to contain only labels with a matching semantic
* Conversely, the empty type can be thought of as a superfeature of
an Empty / Value, and a theoretical empty label (ex. ()) can
be thought of as a superfeature of a Label / Empty
* Ergo, it's probably not necessary to invent distinct representations
* Thus, the equality semantic is the actual interesting variant
* Lines up better, since only Label / Value is able to meaningfully opt between semantics
* Also seems more honest to stick with the minimal Rose syntax and avoid new terms
* Could concrete values be implicitly considered as 'unlabeled labels' via typechecker special-casing?
* i.e. Allow any type T to unify with (() = T)
* Once label variables are implemented, the inverse may also be possible
* i.e. Allowing any label type L to unify with (L = ())
* Would allow transparent use with existing product / sum machinery
* Sum is particularly interesting, since given a type equality semantic,
would allow writing type-specific impls for functions like add
* ex:
add = [
a::Int -> [
b::Int -> add_int_int a b
b::Float -> add_int_float a b
]
a::Float -> [
b::Float -> add_float_float a b
b::Int -> add_float_int a b
]
]
* How to handle differences in equality semantic?
* Doesn't seem viable to mix and match, as different semantics within the same label variant are incomparable
* Ergo, tie to sum / product types
* How to represent this syntactically?
* Type parameter on product / sum constructors?
* Probably best represented as a prefix
* <={} / >={} / =={}?
* l{} / t{} / s{}?
* Label {} / Type {} / Strong {}?
* Wrapper around row types, once they have an AST representation
* Or, variant stored in row type
* Consider type constructors
* Z axis of the lambda cube
* Not equivalent to subtyping; invents new types from existing ones
* Improves expressibility, potentially allows for recursive types
* Would allow the () type to be expressed in-language (as a core element) instead of as an explicit AST node
* In theory would also allow the same for intrinsics
* Would be useful for backend-specific intrinsic defs, ex. rust u8, u16, f32, f64, etc in their own library
* Consider dependent types
* X axis of the lambda cube
* Types depending on terms
* ex. Vector Int 4
* 4 is a value term
* Doesn't add more representational power,
but allows for greater specification power
* Should be comparable to recursive haskell value matching in a sense;
able to go from (Vector Int 4) to ( () = Int & () = Int & () = Int & () = Int ), etc
* Fix w_combinator test
* Fails type inference with disjoint rows
* Think on integrating rust functions in such a way that interpretation and transpilation
can use them transparently
* ex. Functions over intrinsic types; boolean and numeric operations, etc.
* Need some typeclass-equivalent functionality that works with the row system
* Should Bool, Int, Float actually be thought of as opaque references to
a product containing the value and various implementation details?
* ex. How to make an int that knows how to add itself to a float, or vice-versa?
* May be able to take advantage of product-local scope to precurry the value
* i.e:
# make_ functions construct 'Rowclass impls' of a sort,
# with a label-tagged field containing the value,
# and specialized impls against other types stored as sum applicators
int_add_int : { a: Int, b: Int } -> Int
int_add_float : { a: Int, b: Float } -> Float
float_add_float : { a: Float, b: Float } -> Float
float_add_int : { a: Float, b: Int } -> Float
make_int : Int -> { r }
make_int = a -> {
int = a,
impls = {
add = [
b -> int_add_int { a: int, b: b.int },
b -> int_add_float { a: int, b: b.float },
]
},
add = r -> (<~ impls).add (~< (~> r))
}
make_float : Float -> { r }
make_float = a -> {
float = a,
impls = {
add = [
b -> float_add_int { a: float, b: b.int },
b -> float_add_float { a: float, b: b.float },
]
},
add = r -> (<~ impls).add (~< (~> r))
}
+ @ Infix 6
+ = a -> b -> (<~ a).add b
# These would likely be elided in favour of compiler-generated code,
# similar to haskell int literals emitting from_int calls
int = make_int 1
float = make_float 2.0
app_int_int = int + int
app_int_float = int + float
app_float_float = float + float
app_float_int = float + int
* Model intrinsic functions as annotations without an implementation
* Use product types and some corresponding reflection / translation
code on the rust side to model native functions as explicitly uncurried
* May need to introduce some new parser structure that
stores annotation and expression as options
* Should be extensible given a few lens updater functions,
since sum types (i.e. the contents of impls) can be branched arbitrarily
* Could further generalize with labels-as-types / label variables
* The dispatcher function (add, in this case) could be reduced to:
call = impls -> label -> r -> (<~ impls).label (~< (~> r))
add = call impls add
* In this case, call could be moved into std/core
* Suspect the semantics have shaken out such that modifying
such a structure's value row should still yield the correct results for
any precurried functions - i.e. the scoping is structural, rather than lexical
* Unsure if the sum application can infer the correct impl
from the injection of the projection of the input rowclass implementor
* May be tractable with some special-case typechecker code
and corresponding interpreter / transpiler implementations
* Implement label variables
* ex. For abstracting over structuring / destructuring functions
* Extend parser to be able to handle <label> notation
* More generally, need to fix the omission of < and > from builtins
* Current setup isn't specific enough, causes failure with composite builtins
* Implement label equality predicate
* Generate open labels and predicates for Labeled and Unlabeled during check / infer
* Implement unification and substitution of open labels
* Implement as Type::Label(Label) where Label is an enum with Open and Closed variants
* Use angle brackets to denote a label type
* ex.
structure : <l> -> a -> (l = a)
structure = l -> a -> l = a
destructure : <l> -> a -> (l = a) -> a
destructure = l -> a -> (~> a).l
* Labels as types allows them to be passed around as regular values,
and should be restrictable via the typechecker to prevent misuse
* Should be trivial for the interpreter,
but would require an additional lowering pass before transpilation
* Parse-wise, should be implementable as a bucket case for literals
* Parallelism for module typechecking and interpretation
* Store as a tree instead of a flat list, dispatch to worker threads at branches
* Essentially the same approach as cargo
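The branch-dispatch idea can be sketched with a thread pool: children of each branch are independent, so they fan out to workers while the branch itself completes afterward. Python illustration with a stand-in for the real typechecking work; note that nested blocking submission needs a pool with enough workers to avoid starvation:

```python
# Parallel module-tree sketch: typecheck the children of each branch
# concurrently, then the branch itself, so independent subtrees never
# serialize against each other. check_module is a stand-in for real work.
from concurrent.futures import ThreadPoolExecutor

def check_module(name):
    return f"checked {name}"

def check_tree(tree, pool):
    name, children = tree
    # Dispatch all child subtrees to the pool at this branch point,
    # then collect in submission order so output stays deterministic.
    futures = [pool.submit(check_tree, child, pool) for child in children]
    done = [result for f in futures for result in f.result()]
    done.append(check_module(name))
    return done

tree = ("root", [("a", [("a/x", [])]), ("b", [])])
with ThreadPoolExecutor() as pool:
    results = check_tree(tree, pool)
```

Children always finish before their parent, which matches the dependency order a module tree imposes.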
* Figure out why closure_converted typechecks successfully if 'next = clos.next' is changed to 'next = clos.missing'
* Should fail to check, given that 'missing' is not present
* Could be a factor of the concrete impl being commented out due to the issue above
* Inferred as unexpected - but theoretically valid - type
* However, still doesn't explain why concrete application typechecks correctly
* Closure Conversion
* May be able to achieve this using the interpreter,
if expression transformations are representable in-language
* This would be very much preferable, but unsure if truly feasible
* Lambda Lifting
* Product Lifting
* Similar to lambda lifting,
except a (potentially colliding) name already exists,
and the liftee may be any expression rather than just an abs
* Compilation
* Unsure if lowering to a different form of Lambda Calculus is the correct approach
* Compilation favours the simplest possible representation
* i.e. Lowering through System F+*, System F, Simply Typed, etc.
until you have something that can run on a register machine or similar automata
* However, compilation is not a primary goal
* Transpilation favours the representation which most closely matches the target lang
* Preferable to preserve as many language features as possible
to give the target compiler the best chance of generating optimal code
* Rust supports sums, products, polytypes, and modules,
but not row polymorphism or modules as values.
* Ergo, would need to monomorphize row types into concrete structs,
establish conversion rules for module member imports etc
* Naga supports products, but not sums, polytypes, modules, or row polymorphism
* Ergo, would need to monomorphize variant dispatch and polytypes,
and combine everything into one module
* Thusly, some obvious transformations present themselves:
* Reference chasing
* Simplify the AST by filtering out unused expressions
* Module elision
* Inline to only product modules for Rust
* Inline to a single module for Naga
* Type monomorphization
* Detect concrete invocations of polymorphic expressions,
generate corresponding concrete instances
* Product monomorphization
* Use typechecking evidence to generate instances of concrete
product type permutations, look up via HashMap / BTreeMap
* Will require a concrete set of entry points
* Also worth considering fixed-permutation polymorphism here;
provide a set of types to the codegen backend and generate permutations
* i.e. For generating polyadic field functions in elysian
* Sum monomorphization
* Treat each sum member as a free-standing type
* Generate corresponding free-standing function impls
for variant elimination
* Label elimination (w.r.t. Product and Sum handling above)
* Each label-type pairing becomes a single concrete type
* Can likely use type aliases in Rust,
perhaps struct definitions or special variable naming in Naga
* All of the above can likely be represented with the existing AST
by applying implicit constraints on the variants it is allowed
to represent, augmenting its inner Row and Type types,
and providing external data structures for metadata storage
* Interpretation is flexible
* On some level, shares simplicity properties with compilation,
but any representation can be made executable so long as its
model is representable via dynamic dispatch
* Also preferable to retain features here for the sake of better UX
* In practical terms:
* Given an entry point, inline all variables to produce
a complete expression tree
* Transform the expression tree into a DAG,
and substitute duplicated expressions with references to a single
subtree
* Solve expressions from the leaves upward, using runtime code to handle language features
(structuring / destructuring polymorphic rows, concrete type impls, etc)
until a final value is reached
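The tree-to-DAG step is essentially hash-consing: structurally identical subtrees collapse to one shared node, so each duplicated expression is solved once. A Python sketch over the same toy tuple AST used informally throughout:

```python
# Hash-consing sketch: intern every subtree bottom-up so duplicated
# expressions become references to a single shared node, turning the
# inlined expression tree into a DAG.

def node(*parts):
    # Builds a fresh tuple per call, so inputs start out unshared.
    return parts

def intern(expr, table):
    if expr[0] in ("app", "abs"):
        # Intern children first so the rebuilt node is made of shared parts.
        expr = (expr[0], *(intern(c, table) if isinstance(c, tuple) else c
                           for c in expr[1:]))
    # setdefault returns the canonical copy for this structure.
    return table.setdefault(expr, expr)

a = node("app", node("var", "f"), node("var", "x"))
b = node("app", node("var", "f"), node("var", "x"))
tree = node("app", a, b)

table = {}
dag = intern(tree, table)
```

After interning, the two structurally identical children of `dag` are the very same object, so a solver can memoize on node identity.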
* Consider syntax sugar
* Elision of repeated -> for nested Abs nodes
* i.e. Making (a b -> a b) legal
* Inline predicates in type signatures
* (foo : { ra * rb }) instead of (foo : ra * rb ~ rc => { rc })
* Would need to be arbitrarily recursive
* (foo : { ra * rb * rc }) equivalent to (ra * rb ~ rd => rd * rc ~ re => re), etc.
* Other examples from the Rose paper
* Some sort of combined project-unlabel
* Or, some way to make project
* Root-level annotation syntax
* i.e. (: a :: Int => a), equivalent to (<this file> : a :: Int => a)
* Consider a cargo-equivalent CLI tool
* For managing dependencies, workspaces, streamlined compilation, etc
* Should be able to use a git repo to remotely host libraries
* Doesn't need to be extensible as per crates.io, at least not in the near-term
* Consider contributing to thunderseethe's repo via PR
* diff_and_unify disjoint check
* Existing codebase unifies (la : ta) * rb ~ (ra : tb), which is unsolvable
* Can be fixed by checking diff row equality vs sub and goal,
and throwing an error if either matches
* Technically not a problem for the existing codebase,
as it has no means to apply user-defined predicates,
but will become a problem as and when it gets there
* Note that in the presence of an empty row type,
this logic should only be applied if the diff is not empty
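One possible reading of that check, sketched in Python with rows as (label, type) multisets; the function names and the exact degenerate-split condition are assumptions, not the existing codebase's API:

```python
# Disjointness-check sketch for solving sub * diff ~ goal: compute
# diff = goal minus sub, then reject the split if the non-empty diff
# still equals the sub or the goal row, signalling an unsolvable
# constraint like (la : ta) * rb ~ (ra : tb).

def row_diff(goal, sub):
    remaining = list(goal)
    for entry in sub:
        if entry in remaining:
            remaining.remove(entry)
    return remaining

def split_row(sub, goal):
    diff = row_diff(goal, sub)
    # Empty diff is fine (sub == goal, per the empty-row note above);
    # otherwise a diff identical to sub or goal means the subtraction
    # removed nothing meaningful.
    if diff and (sorted(diff) == sorted(sub) or sorted(diff) == sorted(goal)):
        raise ValueError("unsolvable row constraint")
    return diff

solvable = split_row([("a", "Int")], [("a", "Int"), ("b", "Str")])
empty = split_row([("a", "Int")], [("a", "Int")])
```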
* Updated diff_and_unify sub / goal search
* Existing codebase is ill-defined w.r.t. row theory;
allows construction of rows with duplicate keys,
but fails to unify when attempting to use them
* Can be promoted to typed scoped row theory by:
* Searching for matching labels
* Attempting to unify each one
* Picking the last successful unification
* Note that typed scoped row theory is distinct from
scoped row theory as per the ROSE paper,
as it scopes per-label-and-type rather than just per-label
* This is comparable in a sense to unlabeled row theory,
where types are used as the discriminator
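The search-unify-pick-last procedure can be sketched like this; a Python illustration with a deliberately toy unifier (atoms plus a `_` wildcard variable), standing in for real unification:

```python
# Typed-scoped-row sketch: on duplicate labels, attempt unification
# against every candidate and keep the last one that succeeds,
# scoping per label *and* type rather than just per label.

def unify_types(a, b):
    # Toy unifier: types are atoms, "_" is a wildcard variable.
    if a == b or a == "_" or b == "_":
        return b if a == "_" else a
    return None

def project_scoped(row, label, want_type):
    result = None
    for name, ty in row:
        if name != label:
            continue
        unified = unify_types(ty, want_type)
        if unified is not None:
            result = (name, unified)  # later successes shadow earlier ones
    return result

row = [("x", "Int"), ("x", "Float"), ("y", "Int")]
```

Here duplicate `x` entries coexist because they carry distinct types, and projection resolves to the last entry the requested type unifies with.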
* Row occurs check
* Row normalization can proceed infinitely if
unifying a row variable with a closed row containing said variable
* Can be fixed by implementing an occurs check over rows and row variables,
traversing through the type hierarchy as needed
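An occurs check along those lines, sketched in Python; the type-shape encoding (`rowvar`, `row` with entries plus tail, `fun`, atoms) is hypothetical but mirrors the hierarchy traversal described:

```python
# Row occurs-check sketch: before binding row variable rv to a row,
# walk the row (and any nested types) to ensure rv does not occur
# within it, preventing infinite row normalization.

def occurs_in(var, ty):
    kind = ty[0]
    if kind == "rowvar":
        return ty[1] == var
    if kind == "row":
        # A row: check every entry's type, plus the open tail if any.
        _, entries, tail = ty
        if tail is not None and occurs_in(var, tail):
            return True
        return any(occurs_in(var, entry_ty) for _, entry_ty in entries)
    if kind == "fun":
        return occurs_in(var, ty[1]) or occurs_in(var, ty[2])
    return False  # atomic types carry no variables

def bind_row_var(var, row, subst):
    if occurs_in(var, row):
        raise ValueError(f"occurs check failed: {var} appears in its own row")
    subst[var] = row
    return subst

ok = bind_row_var("r0", ("row", [("x", ("atom", "Int"))], None), {})
```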