[IMP] export: reduce JSON size #7756
7516ed3 to 489a7be
489a7be to d6405b8
d6405b8 to e3bdd35
LucasLefevre
left a comment
first review 👌
main point (but not the easiest): BananaCompiledFormula should be type-safe to raise at compile time instead of raising in production
pro-odoo
left a comment
Nice :D
Didn't fully read the squisher yet
3224c45 to 1f3038b
hokolomopo
left a comment
👋
Mostly small things. I didn't read the details of the squisher/unsquisher. Really cool feature 🥳
e1b7c7f to 0adc821
4791b75 to c277d63
robodoo rebase-ff
Merge method set to rebase and fast-forward.
LucasLefevre
left a comment
deeper look at the squish/unsquish
7c961d0 to 3313428
const endPos = toCartesian(end);
for (
  let col = Math.min(startPos.col, endPos.col);
  col <= Math.max(startPos.col, endPos.col);
can be evaluated a single time; same for the condition below
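The suggestion above amounts to hoisting the `Math.min`/`Math.max` calls out of the loop header. A minimal sketch, assuming the positions are plain `{ col, row }` objects and that a similar row loop follows; the `Position` type and the `visitZone` helper are hypothetical and not taken from the PR:

```typescript
// Hypothetical position type; in the PR these values come from toCartesian().
interface Position {
  col: number;
  row: number;
}

// Visits every cell of the rectangle spanned by two corners. The loop bounds
// are computed once instead of calling Math.min/Math.max on each iteration.
function visitZone(
  startPos: Position,
  endPos: Position,
  visit: (col: number, row: number) => void
): void {
  const left = Math.min(startPos.col, endPos.col);
  const right = Math.max(startPos.col, endPos.col);
  const top = Math.min(startPos.row, endPos.row);
  const bottom = Math.max(startPos.row, endPos.row);
  for (let col = left; col <= right; col++) {
    for (let row = top; row <= bottom; row++) {
      visit(col, row);
    }
  }
}
```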
  numbers: string[],
  strings: string[],
  references: string[]
): Partial<SquishedFormula> {
robodoo r+
- it manages the tokens privately
- it tells you if your formula is valid or not
- it replaces 2 classes: FormulaCellWithDependencies and RangeReferenceToken
- it is required for the next commit: allow creating multiple formulas based on the same formula with different parameters

compiler.ts/`compile(formula: string)` --> `CompiledFormula.Compile(formula: string, sheetId: UID, getters: CoreGetters)`
formulas/helpers.ts/`getFunctionsFromTokens(...)` --> `compiledFormula.getFunctionsFromTokens(functionNames: string[], getters: CoreGetters)`
removed `RangeCompiledFormula` --> `CompiledFormula`
removed plugins/core/cell.ts/`ReferenceToken` class --> it is implemented in `CompiledFormula`
formulas/helper.ts/`isExportableToExcel` & `getFunctionsFromTokens` --> compiledFormula.`areAllFunctionsExportableToExcel` & `getFunctionsFromTokens`
removed FormulaCellWithDependencies and ReferenceToken

Task: 5489478
Part-of: #7756
Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
Today all cells have the property `content`, which contains the raw content of the cell: either its value, its formula text or the markdown link. Having the content re-create the formula from its parts is costly, and having the convenience of defining the property `content` on an object, either through a dedicated class or through a closure, is very expensive (100 MB on large models).

This PR moves the property `content` out of Cell and into `LiteralCell`, so it is no longer available on FormulaCell. All the usages of `content` are updated to be specific to LiteralCell (!FormulaCell). Mostly, content was used to check if a cell is empty. This check was `if(!cell?.content)` and becomes `if (!cell?.isFormula && !cell?.content)`.

Task: 5489478
Part-of: #7756
Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
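To illustrate the new emptiness check, here is a minimal sketch. The cell interfaces are simplified, hypothetical stand-ins for the real ones, and `isCellEmpty` is an illustrative helper name rather than a function from the PR:

```typescript
// Hypothetical, simplified cell shapes; the real interfaces carry more fields.
interface LiteralCell {
  isFormula: false;
  content: string;
}
interface FormulaCell {
  isFormula: true;
  // No `content` property here: formula cells no longer expose it.
}
type Cell = LiteralCell | FormulaCell;

function isCellEmpty(cell: Cell | undefined): boolean {
  // Same result as `!cell?.isFormula && !cell?.content`, written so that
  // TypeScript narrows the union before `content` is read.
  if (cell === undefined) {
    return true;
  }
  return !cell.isFormula && !cell.content;
}
```

A formula cell is never considered empty, while a literal cell is empty only when its content is an empty string.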
Description:
Reduce the JSON file size of spreadsheets by storing only changes to formulae and, in a second pass,
collecting all the formulae or changes that are the same and storing them under one single key in the JSON file.
Importing sheets:
~2300 ms --> ~680 ms
1560 MB RAM --> 580 MB
JSON size:
40 MB --> 147 KB
4.4 MB --> 3 KB
1.6 MB --> 43 KB
```
cells: {
  A1: "My Text",
  A2: "=SUM(B2:B9)",
  A3: "=SUM(B2:B9)",
}
```
As we can see, A2 and A3 share the same formula. They can be rewritten as
```
cells: {
  A1: "My Text",
  "A2:A3": "=SUM(B2:B9)",
}
```
This removes the duplication completely.
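The grouping step can be sketched as follows. This is an illustration only, not the PR's squisher: it assumes the cells of a single column are given as `[row, content]` pairs already sorted by row, and `groupIdenticalCells` is a hypothetical name. Whether the real implementation also groups plain text is not stated here.

```typescript
// Groups runs of vertically adjacent cells with identical content under a
// single range key such as "A2:A3". Input rows must be sorted ascending.
function groupIdenticalCells(
  column: string,
  cells: Array<[number, string]>
): Record<string, string> {
  const result: Record<string, string> = {};
  let i = 0;
  while (i < cells.length) {
    const [startRow, content] = cells[i];
    let endRow = startRow;
    // Extend the run while the next cell is directly below and identical.
    while (
      i + 1 < cells.length &&
      cells[i + 1][0] === endRow + 1 &&
      cells[i + 1][1] === content
    ) {
      endRow = cells[i + 1][0];
      i++;
    }
    const key =
      startRow === endRow
        ? `${column}${startRow}`
        : `${column}${startRow}:${column}${endRow}`;
    result[key] = content;
    i++;
  }
  return result;
}
```

Applied to the example above, the entries for A2 and A3 collapse into the single key `"A2:A3"`, while A1 keeps its own key.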
Usually formulas do not repeat exactly; instead, their dependencies (references to other cells, numbers, or strings) are slightly offset. We can rewrite:
```
cells: {
  A1: "My Text",
  A2: '=CONCAT(B2, $C$2, D2, "hello")',
  A3: '=CONCAT(B3, $C$2, D3, "hello")',
  A4: '=CONCAT(B4, $C$2, D4, "hello")',
}
```
to
```
cells: {
  A1: "My Text",
  A2: '=CONCAT(B2, $C$2, D2, "hello")',
  A3: { R: "R+1|=|R+1" },
  A4: { R: "R+1|=|R+1" },
  ...
}
```
Now we see that A3 and A4 have the same transformation, so we can rewrite them as
```
cells: {
  A1: "My Text",
  A2: '=CONCAT(B2, $C$2, D2, "hello")',
  "A3:A4": { R: "R+1|=|R+1" },
}
```
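As a rough illustration of how a transformation string such as `"R+1|=|R+1"` could be derived, the sketch below compares the references of two consecutive formulas. It only handles row offsets and unchanged references; the helper names are hypothetical, and the `R-1` spelling for negative offsets is an assumption, since only `R+1` and `=` appear in the examples.

```typescript
// Splits a reference such as "B2" or "$C$2" into its column part and row number.
function parseRef(ref: string): { col: string; row: number } | undefined {
  const match = /^(\$?[A-Z]+\$?)(\d+)$/.exec(ref);
  return match ? { col: match[1], row: Number(match[2]) } : undefined;
}

// Describes how each reference of `current` differs from the reference at the
// same position in `previous`. Returns undefined when the change cannot be
// expressed as a pure row offset (different column, different arity, ...).
function describeRowOffsets(previous: string[], current: string[]): string | undefined {
  if (previous.length !== current.length) {
    return undefined;
  }
  const parts: string[] = [];
  for (let i = 0; i < previous.length; i++) {
    const from = parseRef(previous[i]);
    const to = parseRef(current[i]);
    if (!from || !to || from.col !== to.col) {
      return undefined;
    }
    const delta = to.row - from.row;
    if (delta === 0) {
      parts.push("=");
    } else {
      // Assumed spelling: "R+1" for one row down, "R-1" for one row up.
      parts.push(delta > 0 ? `R+${delta}` : `R${delta}`);
    }
  }
  return parts.join("|");
}
```

For the references of A2 and A3 above, `describeRowOffsets(["B2", "$C$2", "D2"], ["B3", "$C$2", "D3"])` yields `"R+1|=|R+1"`.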
We apply the same to number arguments and string arguments.
A formula with all the arguments slightly changed might look like
```
cells: {
  A1: '=MY_FORMULA(1, B1, "coucou")',
  A2: { R: "R+1", N: "+1", S: ["hello"] },
  "A3:A4": { R: "R+1", N: "+1" },
}
```
which can be read as
```
cells: {
  A1: '=MY_FORMULA(1, B1, "coucou")',
  A2: '=MY_FORMULA(2, B2, "hello")',
  A3: '=MY_FORMULA(3, B3, "hello")',
  A4: '=MY_FORMULA(4, B4, "hello")',
}
```
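Reading such an entry back can be sketched as applying each delta to the parts of the previous cell's formula. The types and the `applyDelta` helper below are hypothetical, and the sketch assumes that `N` uses the same `|`-separated layout as `R` and that a missing `S` means the strings are unchanged, which matches the example but is not spelled out in the description.

```typescript
// Hypothetical decomposition of a formula into its variable parts.
interface FormulaParts {
  numbers: number[];
  strings: string[];
  references: string[];
}

// Hypothetical shape of a squished entry, mirroring the R / N / S keys above.
interface Delta {
  R?: string;
  N?: string;
  S?: string[];
}

// Applies a row operation such as "R+1" or "=" to a single reference.
function shiftRow(ref: string, op: string): string {
  if (op === "=") {
    return ref;
  }
  const offset = Number(op.slice(1)); // "R+1" -> 1
  return ref.replace(/\d+$/, (row) => String(Number(row) + offset));
}

// Rebuilds the parts of a cell's formula from the previous cell's parts.
function applyDelta(previous: FormulaParts, delta: Delta): FormulaParts {
  const refOps = delta.R?.split("|");
  const numOps = delta.N?.split("|");
  return {
    references: previous.references.map((ref, i) => shiftRow(ref, refOps?.[i] ?? "=")),
    numbers: previous.numbers.map((n, i) => {
      const op = numOps?.[i] ?? "=";
      return op === "=" ? n : n + Number(op);
    }),
    // Assumption: when S is absent, the previous strings are kept.
    strings: delta.S ?? previous.strings,
  };
}
```

Starting from the parts of A1 (`1`, `B1`, `"coucou"`), applying the A2 delta and then the shared A3:A4 delta twice reproduces the expanded formulas listed above.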
closes #7756
Task: 5489478
Signed-off-by: Rémi Rahir (rar) <rar@odoo.com>
@VincentSchippefilt @rrahir staging failed: ci/runbot (view more at https://runbot.odoo.com/runbot/batch/2378120/build/101554677)
3313428 to a34c02c
robodoo r+
closes #7756
Task: 5489478
Signed-off-by: Lucas Lefèvre (lul) <lul@odoo.com>

Description:
Reduce the JSON file size of spreadsheets by storing only changes to formulae and, in a second pass,
collecting all the formulae or changes that are the same and storing them under one single key in the JSON file.
Task: 5489478
API changes:
- compiler.ts/`compile(formula: string)` --> `CompiledFormula.Compile(formula: string, sheetId: UID, getters: CoreGetters)`
- formulas/helpers.ts/`getFunctionsFromTokens(tokens: Tokens[], functionNames: string[])` --> formulas/helpers.ts/`getFunctionsFromTokens(compiledFormula: CompiledFormula, functionNames: string[])`
- removed `RangeCompiledFormula` --> `CompiledFormula`
- model.ts/`export()` --> `export(shouldSquish?: boolean = false)`
- cell.ts/`getTranslatedFormulaCell(sheetId: UID, offsetX: number, offsetY: number, tokens: Token[])` --> `getTranslatedFormulaCell(sheetId: UID, offsetX: number, offsetY: number, compiledFormula: CompiledFormula | SerializedCompiledFormula)`
- removed plugins/core/cell.ts/`ReferenceToken` class --> it is implemented in `CompiledFormula`
- formulas/helper.ts/`isExportableToExcel` & `getFunctionsFromTokens` --> `CompiledFormula.areAllFunctionsExportableToExcel` & `getFunctionsFromTokens`
- removed FormulaCellWithDependencies and ReferenceToken
Observed performance gains on a large dashboard:
Importing sheets:
~2300 ms --> ~680 ms
1560 MB RAM --> 580 MB
JSON size:
40 MB --> 147 KB
4.4 MB --> 3 KB
1.6 MB --> 43 KB
How does it work:
Shared formula
When cells such as A2 and A3 hold exactly the same formula, the formula is stored once under a range key such as "A2:A3", removing the duplication completely.
Formula with small differences
Usually formulas do not repeat exactly; instead, their dependencies (references to other cells, numbers, or strings) are slightly offset. In that case the first cell keeps its full formula and each following cell stores only its transformation, such as { R: "R+1|=|R+1" }. Cells that have the same transformation, such as A3 and A4, are then grouped under one range key such as "A3:A4".
We apply the same to number arguments (N) and string arguments (S), so an entry in which all the arguments are slightly changed might look like { R: "R+1", N: "+1", S: ["hello"] }.
When reading the compressed JSON, the order of the keys in the JSON itself does not matter: the keys are sorted on their first part, which is either the cell reference or the left part of the range.
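A minimal sketch of such a sort is given below. The helper names are hypothetical, and the column-then-row ordering is an assumption; the description only says that keys are sorted on their first part.

```typescript
// Converts column letters to a 1-based index: "A" -> 1, "Z" -> 26, "AA" -> 27.
function columnToIndex(letters: string): number {
  let index = 0;
  for (let i = 0; i < letters.length; i++) {
    index = index * 26 + (letters.charCodeAt(i) - 64);
  }
  return index;
}

// Extracts the first cell of a key: "A3:A4" -> A3, "B1" -> B1.
function firstCell(key: string): { col: number; row: number } {
  const first = key.split(":")[0];
  const match = /^([A-Z]+)(\d+)$/.exec(first);
  if (!match) {
    throw new Error(`Unexpected key: ${key}`);
  }
  return { col: columnToIndex(match[1]), row: Number(match[2]) };
}

// Returns the keys ordered by the first cell of each key
// (assumed order: column first, then row).
function sortSquishedKeys(keys: string[]): string[] {
  return [...keys].sort((a, b) => {
    const ca = firstCell(a);
    const cb = firstCell(b);
    return ca.col - cb.col || ca.row - cb.row;
  });
}
```

Comparing parsed positions rather than raw strings keeps `A2` ahead of `A10`, which a plain string sort would not.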