ThirtySomething/DiCoSto

DiCoSto - DistributedContainerStorage

Some thoughts about a possible storage system.

Motivation

Today you can get a lot of free cloud storage. Dropbox, Google Drive and Microsoft OneDrive are the big players, not to mention the countless instances of Nextcloud, ownCloud and other private clouds. You get 2 GiB here, 10 GiB there, and so on. It is no problem to collect 200 GiB or more of free online storage in total. But it is really difficult to use this scattered storage in a reasonable way.

LICENSE

Copyright 2021 ThirtySomething

This file is part of DiCoSto.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

See also the attached file LICENSE.

DiCoSto

DiCoSto could solve the problem of scattered cloud chunks - as a thought experiment without a real implementation. DiCoSto is a DistributedContainerStorage: the overall storage is made up of distributed containers. A container is a file of definable but fixed size at one location. A container can be located locally, on network shares (NFS/SMB) and/or on online services (Google Drive, OneDrive, Dropbox, ...). DiCoSto offers one API to add and remove containers, and another API to create, read, update and delete files inside the resulting overall storage.
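To make the two APIs a little more concrete, here is a minimal in-memory sketch. It is only an illustration under assumptions: the names `Container`, `DiCoSto`, `add_container`, `create` and so on are hypothetical, files are never split, and each file simply goes to the first container with enough free space. Python is used only for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """One fixed-size container at a single location (hypothetical model)."""
    location: str                               # e.g. a local path or a cloud URL
    size: int                                   # fixed capacity in bytes
    files: dict = field(default_factory=dict)   # file name -> bytes

    def free(self) -> int:
        return self.size - sum(len(data) for data in self.files.values())


class DiCoSto:
    """Toy stand-in for the container API and the file API."""

    def __init__(self):
        self.containers = {}                    # location -> Container

    # --- container API ---
    def add_container(self, location: str, size: int) -> None:
        if location in self.containers:
            raise ValueError(f"container already registered: {location}")
        self.containers[location] = Container(location, size)

    def remove_container(self, location: str) -> None:
        if self.containers[location].files:
            raise ValueError("container still holds files")
        del self.containers[location]

    def total_free(self) -> int:
        return sum(c.free() for c in self.containers.values())

    # --- file API ---
    def _find(self, name: str) -> Container:
        for container in self.containers.values():
            if name in container.files:
                return container
        raise FileNotFoundError(name)

    def create(self, name: str, data: bytes) -> str:
        if any(name in c.files for c in self.containers.values()):
            raise FileExistsError(name)
        # Assumed placement policy: first container with enough free space.
        for container in self.containers.values():
            if container.free() >= len(data):
                container.files[name] = data
                return container.location
        raise OSError("no container has enough free space")

    def read(self, name: str) -> bytes:
        return self._find(name).files[name]

    def update(self, name: str, data: bytes) -> str:
        container = self._find(name)
        old = container.files.pop(name)
        try:
            return self.create(name, data)
        except OSError:
            container.files[name] = old         # roll back if nothing fits
            raise

    def delete(self, name: str) -> None:
        del self._find(name).files[name]
```

With a 10-byte and a 20-byte container registered in that order, a 5-byte file lands in the first container and a 15-byte file in the second. A real implementation would also have to decide what happens to files when a non-empty container is removed; this sketch simply refuses.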

Basically, DiCoSto is a virtual file system. To the user it looks like a simple file system; on the backend it works with containers distributed across different services and/or locations. Each container is similar to a block device. Adding a container means that, after the container file has been physically created, a file system (e.g. ext4) is created inside it. DiCoSto mounts these containers/file systems and presents them as a single file system to the user.
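The "single file system" view can be illustrated with a small sketch. It assumes that every mounted container shows up as an ordinary directory (temporary directories stand in for the mount points here) and that the namespace is flat. The class `UnionView` and its methods are hypothetical names, not part of any existing tool.

```python
import tempfile
from pathlib import Path


class UnionView:
    """Presents several mounted containers as one flat namespace (sketch)."""

    def __init__(self, mount_points):
        self.mount_points = [Path(p) for p in mount_points]

    def listdir(self):
        # Merge the directory listings of all mounted containers.
        names = set()
        for mount in self.mount_points:
            names.update(entry.name for entry in mount.iterdir())
        return sorted(names)

    def resolve(self, name):
        # The first container that holds the name wins (assumed policy).
        for mount in self.mount_points:
            candidate = mount / name
            if candidate.exists():
                return candidate
        raise FileNotFoundError(name)

    def read_bytes(self, name):
        return self.resolve(name).read_bytes()


def demo():
    # Two temporary directories play the role of two mounted containers.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        (Path(a) / "notes.txt").write_bytes(b"on container A")
        (Path(b) / "photo.raw").write_bytes(b"on container B")
        view = UnionView([a, b])
        return view.listdir(), view.read_bytes("photo.raw")
```

The user only sees `notes.txt` and `photo.raw` side by side and never needs to know which container holds which file.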

Benefits

An implementation of DiCoSto could take advantage of well-known and proven software components such as the VFS or the file system used internally, e.g. ext4. Using C/C++ as the programming language would make it portable across platforms such as x86-64 (Windows, Linux), ARM with Linux on systems like the Raspberry Pi, and smartphones running Android or iOS.

Caveats

There is the question of credentials. Every cloud storage service, and possibly some of the other back ends such as NFS or SMB, will need them. How should they be handled? On the one hand DiCoSto should be user-friendly, on the other hand it must also be secure.

In addition to encrypting the credentials, it may be a good idea to use encrypted file systems inside the containers to increase security. This is a possible drawback for performance.

DiCoSto depends on a working internet connection. This could also be a drawback and excludes DiCoSto from some possible usage scenarios.

For instance, the user adds a large file and DiCoSto ends up distributing it across more than one container. What happens if one of the online services involved is down? The user cannot access the file. This could be solved with a software RAID, but that requires more containers and reduces the size usable by the user. In addition, scrubbing could be required to keep the data consistent. Both the RAID and the scrubbing will affect performance.
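The RAID idea can be sketched with simple XOR parity. The text above only says "software RAID", so the single-parity layout used here (similar to RAID 4/5) is an assumption. The function names are hypothetical, and plain byte strings stand in for the chunks stored in the individual containers.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def split_with_parity(data: bytes, data_containers: int):
    """Split data into equal chunks (zero-padded) plus one XOR parity chunk."""
    chunk_len = -(-len(data) // data_containers)       # ceiling division
    padded = data.ljust(chunk_len * data_containers, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len]
              for i in range(data_containers)]
    return chunks + [xor_blocks(chunks)]               # parity goes last


def recover(chunks, original_length: int) -> bytes:
    """Rebuild the data when at most one chunk (marked None) is unavailable."""
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError("single parity only covers one missing container")
    if missing:
        survivors = [chunk for chunk in chunks if chunk is not None]
        chunks = list(chunks)
        # XOR of all surviving chunks reproduces the missing one.
        chunks[missing[0]] = xor_blocks(survivors)
    return b"".join(chunks[:-1])[:original_length]
```

For example, `split_with_parity(data, 3)` yields four chunks for four containers. If any one of them is replaced by `None` (service down), `recover` still returns the original data. The price is the one mentioned above: with N equally sized containers, only N-1 of them hold user data, so four 10 GiB containers would give 30 GiB of usable space.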

Final thoughts

The DiCoSto file system seems appealing at first glance. However, the question arises as to its usefulness - is the scenario shown in the motivation really relevant? If there really is a need for such a file system, implementation will not be easy. Despite many reusable elements from the Linux kernel, it will not be a simple undertaking. Just the idea of a RAID system with files in multiple online services seems quite complex to me.

So if someone wants to realize this idea - I wish them good luck and success. I would be happy to participate in the project as a tester, possibly also as a developer. I could also take over the documentation.

Until then I am waiting for feedback like: "Challenge accepted!".

Technical thoughts

From my point of view, some kind of listener is required. The listener should register all kinds of access to the storage. While I was working on my implementation (which I have since removed), a few things came to mind.

The basic storage container could be something like iSCSI. This is a well-known technique already used for SANs. It offers a block device over the network.

Consequently, this listener should

  • Support iSCSI
  • Support RAID
  • Support access to the storage
  • ...

I should think much more about this.