Draft
68 changes: 68 additions & 0 deletions README.md
@@ -76,6 +76,74 @@ install packages api key: 12345
See [discussion here](https://github.com/SuffolkLITLab/docassemble-AssemblyLine/issues/69)


# Answer Set Import Safety Configuration

Answer set JSON imports are intentionally restricted to reduce the risk from malformed or malicious payloads.

Default behavior:
- Plain JSON values are imported by default, and object reconstruction is allowed only for allowlisted DAObject classes.
- Top-level variable names must match `^[A-Za-z][A-Za-z0-9_]*$`.
- Internal/protected variable names are blocked.
- If `answer set import allowed variables` is not set, imports accept any safe variable name except protected/internal names. When AssemblyLine can detect the target interview's known variables, imports are further restricted to that set.
- Object payloads can be imported when classes are allowlisted; by default, known `docassemble.base` and `docassemble.AssemblyLine` DAObject descendants are allowed.
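
The name checks above can be sketched roughly as follows. This is a minimal illustration, not AssemblyLine's implementation: the regular expression is the one quoted in the list, while `PROTECTED_NAMES` and `is_importable_name` are hypothetical names invented for this example.

```python
import re

# Pattern quoted from the list above: names must start with a letter.
SAFE_NAME = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")

# Hypothetical examples of protected names; the real list lives in AssemblyLine.
PROTECTED_NAMES = {"url_args", "menu_items", "nav"}


def is_importable_name(name, allowed=None):
    """Return True if a top-level variable name passes the documented checks."""
    # fullmatch also rejects a trailing newline, which `$` alone would accept.
    if not SAFE_NAME.fullmatch(name):
        return False
    if name in PROTECTED_NAMES:
        return False
    # With an explicit allowlist, only listed names pass.
    return allowed is None or name in allowed
```

Under these assumptions, `users_name` passes, while `_internal` and `users[0]` fail the pattern and `nav` is blocked as protected.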

Default import limits (`assembly line: answer set import limits`):
- `max bytes`: `1048576` (1 MiB)
- `max depth`: `40`
- `max keys`: `20000`
- `max list items`: `5000`
- `max string length`: `200000`
- `max number abs`: `1000000000000000` (`10**15`)
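
As a rough illustration of how limits like these could be enforced, the sketch below walks a parsed payload and reports violations. It is an assumption-laden example, not the actual validator: `check_limits` is a hypothetical helper, `max keys` is treated as a total across all objects, the top-level value counts as depth 1, and JSON parse errors are not handled.

```python
import json

# Defaults quoted from the list above.
LIMITS = {
    "max bytes": 1048576,
    "max depth": 40,
    "max keys": 20000,
    "max list items": 5000,
    "max string length": 200000,
    "max number abs": 10**15,
}


def check_limits(raw, limits=LIMITS):
    """Return a list of limit violations for a JSON string (empty if none)."""
    # Check the raw size first so oversized payloads are never parsed.
    if len(raw.encode("utf-8")) > limits["max bytes"]:
        return ["max bytes exceeded"]
    problems = []
    key_count = 0

    def walk(value, depth):
        nonlocal key_count
        if depth > limits["max depth"]:
            problems.append("max depth exceeded")
            return  # do not descend further
        if isinstance(value, dict):
            key_count += len(value)
            for child in value.values():
                walk(child, depth + 1)
        elif isinstance(value, list):
            if len(value) > limits["max list items"]:
                problems.append("max list items exceeded")
            for child in value:
                walk(child, depth + 1)
        elif isinstance(value, str):
            if len(value) > limits["max string length"]:
                problems.append("max string length exceeded")
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            if abs(value) > limits["max number abs"]:
                problems.append("max number abs exceeded")

    walk(json.loads(raw), 1)
    if key_count > limits["max keys"]:
        problems.append("max keys exceeded")
    return problems
```

Collecting every violation rather than stopping at the first makes it easier to report problems to the user in one pass.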

Final allowlist/config policy:
- Default allowlist: unset (`answer set import allowed variables` omitted), which falls back to safe-name/protected-name checks plus target-interview variable detection when available.
- Recommended production policy: set an explicit allowlist that contains only the shared, reusable variables used in your jurisdiction.
- `answer set import allow objects` defaults to `true`; set it to `false` if you want strict plain-JSON-only imports.
- `answer set import allowed object classes` can extend the default DAObject class allowlist with explicit additional class paths.
- Additional classes in `answer set import allowed object classes` apply to object envelopes at any depth (top-level variables and nested descendants).
- `answer set import remap known classes` defaults to `true`; this safely maps known class basenames from other packages (such as playground exports) onto official allowlisted classes.
- `answer set import class remap` can define explicit basename-to-class mappings for additional controlled remaps.
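
The interaction between the class allowlist and basename remapping might look something like the sketch below. It only manipulates strings and never imports a class. `ALLOWED_CLASSES`, `CLASS_REMAP`, and `resolve_class` are hypothetical names for illustration, and the allowlist shown is a small subset, not the real default.

```python
# Hypothetical subset of the default allowlist, for illustration only.
ALLOWED_CLASSES = {
    "docassemble.AssemblyLine.al_general.ALIndividual",
    "docassemble.AssemblyLine.al_general.ALPeopleList",
}

# Basename-to-class mapping, in the shape of `answer set import class remap`.
CLASS_REMAP = {
    "ALIndividual": "docassemble.AssemblyLine.al_general.ALIndividual",
    "ALPeopleList": "docassemble.AssemblyLine.al_general.ALPeopleList",
}


def resolve_class(class_path, remap_known=True):
    """Return the allowlisted class path to use, or None to reject."""
    if class_path in ALLOWED_CLASSES:  # exact dotted-name match, no wildcards
        return class_path
    if remap_known:
        # Only the last dotted component is used to look up a remap target.
        basename = class_path.rsplit(".", 1)[-1]
        target = CLASS_REMAP.get(basename)
        if target in ALLOWED_CLASSES:
            return target
    return None
```

In this sketch, `docassemble.playground1.al_general.ALIndividual` resolves to the official `ALIndividual` path, while `docassemble.bad.Actor` is rejected because its basename has no remap entry.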

Example hardened configuration:

```yaml
assembly line:
  enable answer sets: true
  enable answer set imports: true
  answer set import require signed: false
  answer set import allow objects: true
  answer set import remap known classes: true
  answer set import limits:
    max bytes: 1048576
    max depth: 40
    max keys: 20000
    max list items: 5000
    max string length: 200000
    max number abs: 1000000000000000
  answer set import allowed variables:
    - users_name
    - users_address
    - users_phone_number
    - users_email
    - household_size
  answer set import allowed object classes:
    - docassemble.AssemblyLine.al_general.ALIndividual
    - docassemble.AssemblyLine.al_general.ALPeopleList
    - docassemble.AssemblyLine.al_general.ALAddress
  answer set import class remap:
    ALIndividual: docassemble.AssemblyLine.al_general.ALIndividual
    ALPeopleList: docassemble.AssemblyLine.al_general.ALPeopleList
```

Notes:
- Keeping `answer set import require signed: false` matches current compatibility-first behavior; unsigned imports still pass strict structural validation.
- If your environment can manage signing keys, set `answer set import require signed: true` to require signed payloads.
- Class allowlisting uses full dotted class names (exact match), not wildcard patterns.
- Playground-authored classes usually need explicit allowlisting, e.g. `docassemble.playground1.al_general.ALIndividual`.
- If a playground package name changes across environments (for example `playground1` to `playground2`), update `answer set import allowed object classes` to match the runtime class path.
- With `answer set import remap known classes: true`, exports whose class paths end in a known class basename (for example `ALIndividual` in `docassemble.playground1.al_general.ALIndividual`) can be remapped to official allowlisted classes without instantiating the playground class.
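
When deciding which class paths to allowlist, it can help to list every class an export refers to. The sketch below is one way to do that, assuming object envelopes carry their class path under a `_class` key, as in the test fixtures added in this change. `collect_class_paths` is a hypothetical helper, not part of AssemblyLine.

```python
import json


def collect_class_paths(value, found=None):
    """Recursively gather every `_class` string in a parsed JSON payload."""
    if found is None:
        found = set()
    if isinstance(value, dict):
        cls = value.get("_class")
        if isinstance(cls, str):
            found.add(cls)
        for child in value.values():
            collect_class_paths(child, found)
    elif isinstance(value, list):
        for child in value:
            collect_class_paths(child, found)
    return found


# Illustrative export fragment modeled on the playground fixture.
export = json.loads(
    '{"users": {"_class": "docassemble.playground1.al_general.ALPeopleList",'
    ' "elements": [{"_class": "docassemble.playground1.al_general.ALIndividual"}]}}'
)
for path in sorted(collect_class_paths(export)):
    print(path)
```

Any path printed that is neither allowlisted nor covered by a remap would need an entry in `answer set import allowed object classes`.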


# ALDocument class

## Purpose
3 changes: 2 additions & 1 deletion docassemble/AssemblyLine/al_courts.py
@@ -454,7 +454,8 @@ def convert_zip(z: Any) -> str:
"address_zip": convert_zip
}
if hasattr(self, "converters") and self.converters:
assert isinstance(self.converters, dict)
if not isinstance(self.converters, dict):
raise TypeError("converters must be a dict")
merged_converters.update(self.converters)
to_load = path_and_mimetype(load_path)[0]
if self.filename.lower().endswith(".xlsx"):
8 changes: 4 additions & 4 deletions docassemble/AssemblyLine/al_document.py
@@ -34,7 +34,7 @@
from docassemble.base.pdfa import pdf_to_pdfa
from textwrap import wrap
from math import floor
import subprocess
import subprocess # nosec B404
import pikepdf
from typing import Tuple
import secrets
@@ -1104,7 +1104,7 @@ def as_pdf(
try:
main_doc.set_attributes(filename=filename)
main_doc.set_mimetype("application/pdf")
except:
except Exception: # nosec B110
pass

if self.need_addendum():
@@ -1976,7 +1976,7 @@ def get_cacheable_documents(
)
)
result["download_filename"] = filename_root + ext
except:
except Exception: # nosec B110
pass
results.append(result)

@@ -2855,7 +2855,7 @@ def ocrmypdf_task(

completed_ocr = None
try:
completed_ocr = subprocess.run(
completed_ocr = subprocess.run( # nosec B603
ocr_params, timeout=60 * 60, check=False, capture_output=True
)
to_pdf.commit()
8 changes: 4 additions & 4 deletions docassemble/AssemblyLine/al_general.py
@@ -642,7 +642,7 @@ def normalized_address(self) -> Union[Address, "ALAddress"]:
"""
try:
self.geocode()
except:
except Exception: # nosec B110
pass
if self.was_geocoded_successfully() and hasattr(self, "norm_long"):
return self.norm_long
@@ -672,7 +672,7 @@ def state_name(self, country_code: Optional[str] = None) -> str:
if hasattr(self, "country") and self.country and len(self.country) == 2:
try:
return state_name(self.state, country_code=self.country)
except:
except Exception: # nosec B110
pass
try:
return state_name(
@@ -1037,7 +1037,7 @@ def phone_numbers(
elif len(nums):
return list(nums[0].keys())[0]

assert False # We should never get here, no default return is necessary
raise AssertionError("Unreachable: no default return is necessary")

def contact_methods(self) -> str:
"""Generates a formatted string of all provided contact methods.
@@ -2470,7 +2470,7 @@ def is_phone_or_email(text: str) -> bool:
validation_error("Enter a valid phone number or email address")
else:
validation_error("Enter a valid email address")
assert False, "unreachable"
raise AssertionError("unreachable")


def github_modified_date(
8 changes: 4 additions & 4 deletions docassemble/AssemblyLine/data/questions/al_document.yml
@@ -34,9 +34,9 @@ code: |
key=action_argument("key"),
preferred_formats=preferred_formats,
)
email_arg = action_argument('email')
email_arg = action_argument("email")
if isinstance(email_arg, list):
email_str = ', '.join(email_arg)
email_str = ", ".join(email_arg)
else:
email_str = str(email_arg)
if email_success:
@@ -72,9 +72,9 @@ code: |
key=action_argument("key"),
preferred_formats=preferred_formats,
)
email_arg = action_argument('email')
email_arg = action_argument("email")
if isinstance(email_arg, list):
email_str = ', '.join(email_arg)
email_str = ", ".join(email_arg)
else:
email_str = str(email_arg)
if email_success:
37 changes: 30 additions & 7 deletions docassemble/AssemblyLine/data/questions/al_saved_sessions.yml
@@ -3,6 +3,7 @@ code: |
# HACK
# Create a placeholder value to avoid playground errors
al_sessions_snapshot_results = DAEmpty()
al_sessions_last_import_report = {"accepted": [], "rejected": [], "warnings": []}
---
initial: True
code: |
@@ -217,18 +218,42 @@ back button: False
---
id: al sessions load status
continue button field: al_sessions_load_status
comment: |
#TODO There's no error handling yet so this might be a lie
question: |
% if al_sessions_snapshot_results:
Your answer set was loaded
% else:
Your answer set was not loaded. You can try again.
% endif
subquestion: |
% if defined('al_sessions_last_import_report'):
% if al_sessions_last_import_report.get('warnings'):
${ collapse_template(al_sessions_import_warnings_template) }
% endif
% if al_sessions_last_import_report.get('rejected'):
${ collapse_template(al_sessions_import_rejected_template) }
% endif
% endif

Tap "next" to keep answering any unanswered questions and finish the interview.
back button: False
---
template: al_sessions_import_warnings_template
subject: Import warnings
content: |
% for warning in al_sessions_last_import_report.get('warnings', []):
* ${ warning }
% endfor
---
template: al_sessions_import_rejected_template
subject: Variables skipped during import
content: |
% for item in al_sessions_last_import_report.get('rejected', [])[:50]:
* `${ item.get('path', '?') }`: ${ item.get('reason', 'unknown reason') }
% endfor
% if len(al_sessions_last_import_report.get('rejected', [])) > 50:
* ${ len(al_sessions_last_import_report.get('rejected', [])) - 50 } more variables were skipped and are not shown here.
% endif
---
question: |
Upload a JSON file
subquestion: |
@@ -239,11 +264,9 @@ fields:
accept: |
"application/json, text/json, text/*, .json"
validation code: |
try:
json.loads(al_sessions_json_file.slurp())
except:
validation_error("Upload a file with valid JSON")
is_valid_json(al_sessions_json_file.slurp())
---
code: |
al_sessions_snapshot_results = load_interview_json(al_sessions_json_file.slurp())
al_sessions_import_json = True
al_sessions_last_import_report = get_last_import_report()
al_sessions_import_json = True
8 changes: 7 additions & 1 deletion docassemble/AssemblyLine/data/questions/al_settings.yml
@@ -119,4 +119,10 @@ code: |
---
code: |
# Can be an exact path or just a name, in which case we will search /usr/share/fonts and /var/www/.fonts for a matching file ending in .ttf
al_typed_signature_font = "/usr/share/fonts/truetype/google-fonts/BadScript-Regular.ttf"
al_typed_signature_font = "/usr/share/fonts/truetype/google-fonts/BadScript-Regular.ttf"
---
code: |
# Allow users to import answer sets from JSON files.
# The global config 'enable answer set imports' is checked first; this variable allows
# interview authors to disable imports at the interview level even if global config permits them.
al_allow_answer_set_imports = True
1 change: 1 addition & 0 deletions docassemble/AssemblyLine/data/questions/al_visual.yml
@@ -117,6 +117,7 @@ data from code:
(
get_config('assembly line',{}).get('enable answer sets')
and get_config('assembly line',{}).get('enable answer set imports')
and al_allow_answer_set_imports
)
or (user_logged_in() and user_has_privilege('admin'))
)
@@ -0,0 +1,4 @@
{
"users_name": "Alex",
"city": "Boston",
}
@@ -0,0 +1,5 @@
{
"users_name": "Alex",
"__class__": "builtins.object",
"city": "Boston"
}
@@ -0,0 +1,7 @@
{
"users_name": "Alex",
"_internal": {
"steps": 99
},
"city": "Boston"
}
@@ -0,0 +1,22 @@
{
"users": {
"_class": "docassemble.playground1.al_general.ALPeopleList",
"instanceName": "users",
"object_type": {
"_class": "type",
"name": "docassemble.playground1.al_general.ALIndividual"
},
"elements": [
{
"_class": "docassemble.playground1.al_general.ALIndividual",
"instanceName": "users[0]",
"name": {
"_class": "docassemble.playground1.al_general.IndividualName",
"instanceName": "users[0].name",
"first": "Client",
"last": "Example"
}
}
]
}
}
@@ -0,0 +1,6 @@
{
"users": {
"_class": "docassemble.bad.Actor",
"instanceName": "users"
}
}
@@ -0,0 +1,45 @@
{
"users": {
"_class": "docassemble.AssemblyLine.al_general.ALPeopleList",
"instanceName": "users",
"object_type": {
"_class": "type",
"name": "docassemble.AssemblyLine.al_general.ALIndividual"
},
"elements": [
{
"_class": "docassemble.AssemblyLine.al_general.ALIndividual",
"instanceName": "users[0]",
"name": {
"_class": "docassemble.base.util.IndividualName",
"instanceName": "users[0].name",
"first": "Client",
"last": "Example"
},
"agent": {
"_class": "docassemble.AssemblyLine.al_general.ALIndividual",
"instanceName": "spouse"
},
"custom_text": "notes",
"custom_float": 1.25,
"custom_dict": {
"_class": "docassemble.base.util.DADict",
"instanceName": "users[0].custom_dict",
"elements": {
"case": "A123"
}
}
},
{
"_class": "docassemble.AssemblyLine.al_general.ALIndividual",
"instanceName": "spouse",
"name": {
"_class": "docassemble.base.util.IndividualName",
"instanceName": "spouse.name",
"first": "Spouse",
"last": "Example"
}
}
]
}
}