measureByteSize() gets called twice in EvaluatePrequential for every stats collection cycle #272

Open
nuwangunasekara opened this issue Mar 7, 2023 · 0 comments

nuwangunasekara (Collaborator) commented Mar 7, 2023

measureByteSize() gets called twice in EvaluatePrequential for every stats collection cycle:

                double RAMHoursIncrement = learner.measureByteSize() / (1024.0 * 1024.0 * 1024.0); //GBs
                RAMHoursIncrement *= (timeIncrement / 3600.0); //Hours
                RAMHours += RAMHoursIncrement;
                lastEvaluateStartTime = evaluateTime;
                learningCurve.insertEntry(new LearningEvaluation(

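The second call appears to come from AbstractClassifier, where the "model serialized size (bytes)" measurement is added (see the AbstractClassifier diff further down).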

This can add significant computing overhead to periodic stats collection for ensemble methods such as SRP and ARF with a large number of base learners (100).

Simple test with default SRP parameters and default stream:

moa.DoTask "EvaluatePrequential -l meta.StreamingRandomPatches -i 100000 -f 10000 -q 10000"

  • MOA master 6eacf9b
    Task completed in 6m24s (CPU time)

  • Time after commenting out the first occurrence:

diff --git a/moa/src/main/java/moa/tasks/EvaluatePrequential.java b/moa/src/main/java/moa/tasks/EvaluatePrequential.java
index 8003489..16b51c8 100644
--- a/moa/src/main/java/moa/tasks/EvaluatePrequential.java
+++ b/moa/src/main/java/moa/tasks/EvaluatePrequential.java
@@ -213,7 +213,7 @@ public class EvaluatePrequential extends ClassificationMainTask implements Capab
                 long evaluateTime = TimingUtils.getNanoCPUTimeOfCurrentThread();
                 double time = TimingUtils.nanoTimeToSeconds(evaluateTime - evaluateStartTime);
                 double timeIncrement = TimingUtils.nanoTimeToSeconds(evaluateTime - lastEvaluateStartTime);
-                double RAMHoursIncrement = learner.measureByteSize() / (1024.0 * 1024.0 * 1024.0); //GBs
+                double RAMHoursIncrement = 0.0 / (1024.0 * 1024.0 * 1024.0); //GBs
                 RAMHoursIncrement *= (timeIncrement / 3600.0); //Hours
                 RAMHours += RAMHoursIncrement;
                 lastEvaluateStartTime = evaluateTime;

Task completed in 5m7s (CPU time)

  • Time after commenting out both occurrences:
diff --git a/moa/src/main/java/moa/classifiers/AbstractClassifier.java b/moa/src/main/java/moa/classifiers/AbstractClassifier.java
index f60467d..30636a2 100644
--- a/moa/src/main/java/moa/classifiers/AbstractClassifier.java
+++ b/moa/src/main/java/moa/classifiers/AbstractClassifier.java
@@ -185,7 +185,7 @@ public abstract class AbstractClassifier extends AbstractOptionHandler
         measurementList.add(new Measurement("model training instances",
                 trainingWeightSeenByModel()));
         measurementList.add(new Measurement("model serialized size (bytes)",
-                measureByteSize()));
+                0.0));
         Measurement[] modelMeasurements = getModelMeasurementsImpl();
         if (modelMeasurements != null) {
             measurementList.addAll(Arrays.asList(modelMeasurements));
diff --git a/moa/src/main/java/moa/tasks/EvaluatePrequential.java b/moa/src/main/java/moa/tasks/EvaluatePrequential.java
index 8003489..16b51c8 100644
--- a/moa/src/main/java/moa/tasks/EvaluatePrequential.java
+++ b/moa/src/main/java/moa/tasks/EvaluatePrequential.java
@@ -213,7 +213,7 @@ public class EvaluatePrequential extends ClassificationMainTask implements Capab
                 long evaluateTime = TimingUtils.getNanoCPUTimeOfCurrentThread();
                 double time = TimingUtils.nanoTimeToSeconds(evaluateTime - evaluateStartTime);
                 double timeIncrement = TimingUtils.nanoTimeToSeconds(evaluateTime - lastEvaluateStartTime);
-                double RAMHoursIncrement = learner.measureByteSize() / (1024.0 * 1024.0 * 1024.0); //GBs
+                double RAMHoursIncrement = 0.0 / (1024.0 * 1024.0 * 1024.0); //GBs
                 RAMHoursIncrement *= (timeIncrement / 3600.0); //Hours
                 RAMHours += RAMHoursIncrement;
                 lastEvaluateStartTime = evaluateTime;

Task completed in 3m45s (CPU time)
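
To put the three timings side by side: the runs took 384 s, 307 s and 225 s, so removing the first call saves about 20% of CPU time and removing both saves about 41%. The small helper below only redoes that arithmetic from the numbers reported above; the class and method names are made up for illustration and are not part of MOA.

```java
import java.util.Locale;

public class TimingSummary {
    /** Converts a "XmYs" timing to total seconds. */
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    /** Percentage of the baseline CPU time saved by a variant run. */
    static double percentSaved(int baselineSeconds, int variantSeconds) {
        return 100.0 * (baselineSeconds - variantSeconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        int baseline = toSeconds(6, 24);     // MOA master 6eacf9b
        int firstRemoved = toSeconds(5, 7);  // first occurrence commented out
        int bothRemoved = toSeconds(3, 45);  // both occurrences commented out

        System.out.printf(Locale.ROOT, "first removed: %d s saved (%.1f%%)%n",
                baseline - firstRemoved, percentSaved(baseline, firstRemoved));
        System.out.printf(Locale.ROOT, "both removed:  %d s saved (%.1f%%)%n",
                baseline - bothRemoved, percentSaved(baseline, bothRemoved));
    }
}
```

These are single runs, so the percentages should be read as rough indications rather than precise measurements.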

We could compute the byte size once per stats collection cycle and pass the already calculated value to getModelMeasurementsImpl().
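
A rough sketch of that idea, outside MOA: measure once per cycle, then reuse the value for both the RAM-hours increment and the size measurement. Everything here is hypothetical (CountingLearner, the getModelMeasurements(long) overload, ramHoursIncrement); it is not the existing MOA API, only an illustration of the call pattern. The RAM-hours arithmetic mirrors the snippet from EvaluatePrequential above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ByteSizeReuseSketch {
    /** Hypothetical stand-in for a learner whose size measurement is expensive. */
    static class CountingLearner {
        int measureCalls = 0;          // counts how often the size is measured
        final long sizeInBytes;

        CountingLearner(long sizeInBytes) {
            this.sizeInBytes = sizeInBytes;
        }

        long measureByteSize() {
            measureCalls++;
            return sizeInBytes;
        }

        /** Hypothetical overload: reuses a size the caller already measured. */
        Map<String, Double> getModelMeasurements(long precomputedByteSize) {
            Map<String, Double> measurements = new LinkedHashMap<>();
            measurements.put("model serialized size (bytes)", (double) precomputedByteSize);
            return measurements;
        }
    }

    /** Same arithmetic as in EvaluatePrequential: GBs times hours. */
    static double ramHoursIncrement(long byteSize, double timeIncrementSeconds) {
        double gigabytes = byteSize / (1024.0 * 1024.0 * 1024.0);
        return gigabytes * (timeIncrementSeconds / 3600.0);
    }

    public static void main(String[] args) {
        CountingLearner learner = new CountingLearner(2L * 1024 * 1024 * 1024);

        long byteSize = learner.measureByteSize();  // measured once per cycle
        double increment = ramHoursIncrement(byteSize, 1800.0);
        Map<String, Double> measurements = learner.getModelMeasurements(byteSize);

        System.out.println("RAM-hours increment: " + increment);
        System.out.println("measurements: " + measurements);
        System.out.println("measureByteSize() calls: " + learner.measureCalls);
    }
}
```

In MOA itself the change would need to keep the existing no-argument paths working for other callers, so an overload (rather than a signature change) may be the less intrusive option.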

The same happens with EvaluateInterleavedTestThenTrain.

How to run the tests
test.txt
