This repository has been archived by the owner on Mar 14, 2019. It is now read-only.
# Insert process codebase walk through
Rhys Bartels-Waller edited this page Jun 3, 2015 · 5 revisions
The following list covers only the key phases of a successful insert; the comprehensive flow can be reviewed in the source.
There are three primary phases in the insert process; the first two differ slightly when the file is sourced from the client:
- Inserting into the underlying collection
- Transferring data into the TempStore
- Making copies into the supplied stores
From the client:

- File is validated against the defined filters.
- [1] Document is inserted into the underlying Minimongo collection (suffixed with `.files`).
- [2] Upload of the file data begins (implementing this method): the file is chunked and queued using a queue from the http-upload package. Each task in the client's queue is a sub-queue containing the chunks for that file.
- Each chunk received via an HTTP PUT request is piped to the registered TempStore.
The remaining steps are common to both paths.
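The chunking in step [2] above can be sketched roughly as follows. This is a hypothetical illustration: the 2 MB chunk size, the `chunkFile` name, and the task shape are assumptions, not the actual cfs/http-upload API.

```javascript
// Hypothetical sketch of client-side chunking: slice the file data into
// fixed-size chunks and build one upload task per chunk. The 2 MB chunk
// size and the task shape are assumptions, not CollectionFS's actual values.
const CHUNK_SIZE = 2 * 1024 * 1024;

function chunkFile(fileId, data, chunkSize = CHUNK_SIZE) {
  const tasks = [];
  for (let start = 0, n = 0; start < data.length; start += chunkSize, n++) {
    tasks.push({
      fileId,                              // the .files document from step [1]
      chunk: n,                            // chunk index sent with the HTTP PUT
      data: data.slice(start, start + chunkSize),
    });
  }
  return tasks;
}

// Each task would become an HTTP PUT whose body is the chunk; on the server,
// the request stream is piped into the registered TempStore.
const tasks = chunkFile('abc123', Buffer.alloc(5 * 1024 * 1024));
console.log(tasks.length); // 3
```

Because each file's chunks form a sub-queue of the client's main queue, chunks for one file can be retried or resumed without disturbing uploads of other files.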
From the server:

- File is validated against the defined filters.
- [1] Document is inserted into the underlying Mongo collection (suffixed with `.files`).
- [2] The read stream for the file is piped to the registered TempStore.

The remaining steps are common to both paths:
- [3] A FileWorker observes the collection for files that have been uploaded but not yet stored, triggering `saveCopy`.
- A TempStore read stream, with optional write and read transformations, is piped to each defined store.
- The FileWorker also observes the collection for files where every store is complete, triggering removal of the file from the TempStore.
## Proposed changes to this page from the work in PR 667
- In-memory queue to govern the server-side TempStore transfer.
- [3] When the TempStore write stream ends, the TempStore already emits a `stored` event, which the collection hears and then emits an `uploaded` event. This event is heard within the same process, and creates a `saveCopy` job per store in a persistent queue. If using GridFS or S3 for the stores (TempStore included), any Worker in a separate process can complete the job by observing the queue.
- A separate collection-observer package observes the collection and triggers the creation of a `removeTempFile` job when all stores are complete. This needs to be isolated to a single instance in the system to avoid duplicates.