c2d output upload#1263
Conversation
|
/run-security-scan |
|
/run-security-scan |
alexcos20
left a comment
There was a problem hiding this comment.
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This pull request introduces functionality for C2D compute jobs to upload their outputs to remote storage (e.g., URL-based storage) with optional AES encryption. It updates core interfaces for C2D output configuration, modifies the KeyManager to support external symmetric keys for AES encryption and recipient public keys for ECIES, and integrates the remote upload logic into the C2DEngineDocker. New validation for ComputeOutput encryption parameters is added, along with a comprehensive integration test to cover the remote upload scenario.
Comments:
• [INFO][other] The ComputeOutput interface has undergone a significant transformation, replacing multiple direct fields with remoteStorage and encryption objects. This design change, while powerful for enabling remote output, means that any existing code directly interacting with the ComputeOutput interface will likely require updates. However, the PR's approach of passing an ECIES-encrypted string of ComputeOutput in command handlers mitigates direct breaking changes at the API level, pushing the parsing/decryption internally. This is a good pattern for extensibility and security.
• [INFO][other] The key parameter has been added to encrypt and encryptStream. The current JSDoc comment for key is clear about its role for ECIES (recipient public key) and AES (symmetric key). This makes the KeyManager more flexible and reusable for different encryption contexts.
• [WARNING][security] For AES encryption, if no key is explicitly provided, the implementation falls back to using privateKey.raw (the node's private key) as the AES cipher key. While technically functional, reusing a private key directly as a symmetric encryption key for data is generally not considered a best practice in cryptography due to potential side-channel attacks or compromise of the private key if the symmetric key is exposed. It's often safer to derive a separate symmetric key using a Key Derivation Function (KDF) from a shared secret or a fresh random key. However, given that for ComputeOutput encryption, an explicit key from the ComputeOutputEncryption object is provided, this fallback might only apply to other internal uses of KeyManager.encrypt that do not provide a key. Please confirm if this fallback path is intended for critical data encryption or only for less sensitive internal usage.
• [INFO][other] The ECIES encryption logic now conditionally uses an explicitly provided key (interpreted as the recipient's public key in hex) or falls back to the node's own public key (new eciesjs.PrivateKey(privateKey.raw).publicKey.toHex()). This is a notable change from potentially always encrypting to the node's own public key, making the encrypt method more versatile. It's important that calling code (like in startCompute.ts) correctly provides the recipient's public key or relies on the appropriate default.
• [WARNING][style] Typo: Encrypted required, but no key should probably be Encryption required, but no key for better clarity.
• [INFO][other] The validateOutput method in CommonComputeHandler is a crucial addition. It ensures that the incoming output (encrypted string) can be decrypted and its contents (ComputeOutput) adhere to the expected structure and constraints (e.g., AES encryption method, key length). This pre-validation step enhances the robustness and security of the C2D compute start process.
• [INFO][other] The new integration test for 'should start a compute job with output to URL storage' is excellent. It covers the full lifecycle of remote output, including encryption, job execution, and verification by downloading the output. This provides strong confidence in the new feature's functionality.
• [INFO][other] The test uses http://172.15.0.7:80/ as the remote storage URL. It's important to confirm that the test environment (e.g., Docker Compose for integration tests) reliably provides an HTTP server at this specific IP and port that can handle file uploads and serve them for download. This setup is crucial for the test to pass consistently.
a28bb55
into
feature/ftp_storage_and_uploads
* add ftp support & upload for url/ftp/s3. Reorg tests * c2d output upload (#1263) * c2d output upload
C2D output upload (remote storage and encryption)
Summary
This PR extends Compute-to-Data (C2D) so that compute job outputs can be sent to remote storage (URL, S3, FTP, etc.) instead of only being stored locally on the node. The output configuration is sent as an ECIES-encrypted payload, and the C2D Docker engine decrypts it, streams the job’s output archive to the chosen storage backend, and optionally applies on-the-fly encryption to the uploaded stream (AES only; other ciphers may be added later). Test utilities and test fixes are included (e.g.
waitForAllJobsToFinish, use of minimal resources, andoutput: nullfor default behavior).Changes
1. ComputeOutput and encrypted output field
ComputeOutput(src/@types/C2D/C2D.ts):remoteStorage?: StorageObjectandencryption?: ComputeOutputEncryption.ComputeOutputEncryption: Only AES is supported for encrypting the upload stream. It hasencryptMethod: EncryptMethod.AESandkey: string(AES symmetric key, hex-encoded). Support for more ciphers may be added in the future.outputfield in the compute start command and inDBComputeJobis an ECIES-encrypted string that decodes to aComputeOutputobject. It is never stored in plain text in the database.output: null(or omitted) means “no remote output config”; the node keeps the default behavior (e.g. writing the outputs archive to local disk). Empty object is no longer treated as valid; tests were updated to useoutput: nullwhere no remote output is intended.2. Validation and start flow
CommonComputeHandler.validateOutput(src/components/core/compute/startCompute.ts):outputstring before starting the job: decrypts with the node’s KeyManager (ECIES) and parses asComputeOutput. Invalid or malformed payload returns 400 and the job is not started.3. KeyManager usage
keyManager.encryptStream()with the AES key fromoutput.encryption.keywhen encrypting the upload stream. Only AES is supported for this step.4. C2D Docker engine: remote upload and optional AES encryption
src/components/c2d/compute_engine_docker.ts):job.outputis set, the engine decrypts it to aComputeOutputobject and obtains the storage class foroutput.remoteStorage(reusing the same storage layer used elsewhere: URL, S3, FTP, etc.).hasUpload) and implementsupload(filename, stream):/data/outputsarchive (e.g. viacontainer.getArchive()) tostorage.upload().output.encryptionis set and has akey, the upload stream is passed throughkeyManager.encryptStream()with AES and that key before upload, so the remote file can be stored encrypted. Only AES is supported for this encryption.outputsArchivePath).ResultsUploadFailedand the error is logged.