This is a Python library for time series data segmentation, specifically developed for clinical data. It includes the following components:
- Dimensionality reduction using Non-negative Matrix Factorization (NMF)
- Selection of the optimal number of clusters using the silhouette, Calinski-Harabasz, and Davies-Bouldin scores.
- Predictive modeling using Multilayer Perceptron (MLP) classifier, Support Vector Machines (SVM), and Random Forest.
- Explanation of cluster groups using SHAP values.
- Analysis and simulation of disease progression using skip-grams and Markov chains, with plots showing how group likelihoods change over time.
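The Markov chain part of the last component can be illustrated with a minimal NumPy sketch. It is not the library's own API: the function names `estimate_transition_matrix` and `propagate`, the toy patient sequences, and the uniform fallback for groups with no observed transitions are all assumptions made for illustration. The skip-gram analysis is not covered here.

```python
import numpy as np


def estimate_transition_matrix(sequences, n_groups):
    """Estimate a first-order Markov transition matrix from group label sequences.

    Hypothetical helper: each sequence is one patient's cluster group over time.
    """
    counts = np.zeros((n_groups, n_groups), dtype=float)
    for seq in sequences:
        # Count every observed move from one time step to the next.
        for current, nxt in zip(seq[:-1], seq[1:]):
            counts[current, nxt] += 1

    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: groups with no observed outgoing transitions get a uniform row.
    transition = np.full((n_groups, n_groups), 1.0 / n_groups)
    seen = row_sums[:, 0] > 0
    transition[seen] = counts[seen] / row_sums[seen]
    return transition


def propagate(initial, transition, n_steps):
    """Return group likelihoods for steps 0..n_steps, shape (n_steps + 1, n_groups)."""
    history = [np.asarray(initial, dtype=float)]
    for _ in range(n_steps):
        history.append(history[-1] @ transition)
    return np.vstack(history)


# Toy data: three hypothetical patients, each moving between groups 0, 1, and 2.
sequences = [[0, 0, 1, 2], [0, 1, 1, 2], [1, 2, 2, 2]]
transition = estimate_transition_matrix(sequences, n_groups=3)

# Start with certainty in group 0 and follow the likelihoods for five steps.
likelihoods = propagate([1.0, 0.0, 0.0], transition, n_steps=5)
print(transition)
print(likelihoods)
```

Each column of `likelihoods` can then be plotted against the step index with Matplotlib to show how the likelihood of each group changes over time.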
To use the library, import it into your project and apply the components above in the order listed. Detailed usage instructions and examples are in the library's documentation.
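As a rough sketch of how the first three components fit together, the example below calls scikit-learn directly rather than this library's own functions, which are described in its documentation. Several choices are assumptions made for illustration: the synthetic data, the use of KMeans as the clustering algorithm, choosing the cluster count by silhouette score alone, and training the classifier on the original features rather than the NMF components.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.model_selection import train_test_split

# Synthetic non-negative data standing in for clinical measurements.
# NMF requires non-negative input.
rng = np.random.default_rng(0)
X = rng.random((200, 12))

# 1. Dimensionality reduction with NMF.
nmf = NMF(n_components=4, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)

# 2. Score candidate cluster counts (KMeans is an assumption, see above).
#    Higher is better for silhouette and Calinski-Harabasz,
#    lower is better for Davies-Bouldin.
scores = {}
for k in range(2, 7):
    candidate = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(W)
    scores[k] = {
        "silhouette": silhouette_score(W, candidate),
        "calinski_harabasz": calinski_harabasz_score(W, candidate),
        "davies_bouldin": davies_bouldin_score(W, candidate),
    }

# Assumption: pick the cluster count with the best silhouette score.
best_k = max(scores, key=lambda k: scores[k]["silhouette"])
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(W)

# 3. Train a classifier to predict the cluster group from the original features.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

print("best k:", best_k)
print("held-out accuracy:", accuracy)
```

The three scores do not always agree on the same cluster count, so in practice it may be worth inspecting all of `scores` before settling on a value.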
The library requires the following dependencies:
- NumPy
- Pandas
- Scikit-learn
- SHAP
- NLTK
- Matplotlib (for visual representation)
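One quick way to confirm that these dependencies are available is to look up their import names with the standard library. This is only a convenience sketch. The `REQUIRED` mapping is written for this example and maps each package to its usual import name (for instance, Scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Package name as listed above -> module name used in import statements.
REQUIRED = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Scikit-learn": "sklearn",
    "SHAP": "shap",
    "NLTK": "nltk",
    "Matplotlib": "matplotlib",
}

# find_spec returns None when a top-level module cannot be located.
missing = [name for name, module in REQUIRED.items() if find_spec(module) is None]

if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```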
We welcome contributions to this library. If you have any suggestions or bug reports, please create a GitHub issue. If you would like to contribute code, please submit a pull request.
This library is available under the GNU General Public License Version 3.