I am working on implementing the Final Food-101 Model, aiming to beat the published paper results, on an Apple Mac Pro 2019 (Intel 3.2 GHz 16-core Xeon W, AMD Radeon Pro Vega II 32 GB) using the Apple Metal GPU library.
I have been able to load/run with:
python==3.10
tensorflow==2.13
tensorflow-hub==0.16.0
tensorflow-datasets==4.9.3
tensorflow-metal==1.0.0 (I have only been able to load 1.1.0 on M*-based Macs)
I reached an accuracy of 77.6% - I could probably do better, but I am moving on with the course for now :)
For buffer_size and batch_size, I experimented with different values to see how fit/training performance varied on the 07_efficientnetb0_feature_extract_model_mixed_precision model.
I find that I am able to run with mixed precision enabled. I used <Policy "mixed_bfloat16"> because, although the AMD GPU does not support mixed precision, the Intel Xeon does for CPU operations, and bfloat16 is the preferred type for that processor according to what I could find online.
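For reference, setting that policy in TF 2.13's Keras mixed-precision API looks like this (a minimal sketch; whether the bfloat16 ops actually hit the Xeon's fast path depends on the TensorFlow build):

```python
import tensorflow as tf
from tensorflow.keras import mixed_precision

# Assumption from the post: the AMD GPU lacks mixed-precision support,
# so the target is the Xeon's bfloat16 CPU path, not float16 tensor cores.
mixed_precision.set_global_policy("mixed_bfloat16")

policy = mixed_precision.global_policy()
print(policy.name)            # mixed_bfloat16
print(policy.compute_dtype)   # bfloat16  (layer computations)
print(policy.variable_dtype)  # float32   (weights stay full precision)
```

Keras layers built after this call compute in bfloat16 but keep their variables in float32, which is what the `<Policy "mixed_bfloat16">` repr above refers to.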
Rough (by eye) average over 3 epochs:
| BATCH_SIZE | BUFFER_SIZE | Time per epoch |
| ---: | ---: | ---: |
| 10 | 250 | 1320 s |
| 16 | 500 | 998 s |
| 32 | 1000 | 650 s |
| 80 | 2500 | 480 s |
| 128 | 2500 * | 459 s |
* I kept BUFFER_SIZE at 2500 to keep CPU RAM use from expanding any further.
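For context, here is a minimal sketch of where those two knobs sit in a `tf.data` input pipeline (synthetic stand-in data; the real notebook loads Food-101 via tensorflow-datasets and resizes to 224x224):

```python
import tensorflow as tf

BATCH_SIZE = 128    # largest value from the table above
BUFFER_SIZE = 2500  # capped to limit CPU RAM growth

# Small synthetic stand-in for the Food-101 training split.
images = tf.random.uniform((500, 32, 32, 3))
labels = tf.random.uniform((500,), maxval=101, dtype=tf.int32)

train_data = (tf.data.Dataset.from_tensor_slices((images, labels))
              .shuffle(BUFFER_SIZE)         # buffer held in CPU RAM
              .batch(BATCH_SIZE)            # samples per gradient update
              .prefetch(tf.data.AUTOTUNE))  # overlap loading with training

first_images, _ = next(iter(train_data))
print(first_images.shape)  # (128, 32, 32, 3)
```

The shuffle buffer is what drives the RAM growth mentioned above: TensorFlow materializes up to BUFFER_SIZE decoded examples in host memory, so capping it at 2500 bounds that cost.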
The only GPU performance measure available to me is GPU "Load"; the last/bottom row generates the highest CPU and GPU load and the shortest time per epoch.
@mrdbourke - is there a downside to running these larger BATCH_SIZE and BUFFER_SIZE values? I do not detect any effect on the models' training/evaluation metrics.
As to the Final Food-101 Model - I am not there yet...
My first (naive) attempt was not bad, and gave me a starting point.
I next tried halving the (fixed) learning rate - that was a bit better.
Then I tried a variable learning rate - starting at the halved value - and that was a bit better still, but "no cigar".
Now I am running with some data augmentation on the training data, and val_accuracy is keeping up better with the training accuracy, but it is still running, so we shall see...
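The last two steps could be sketched as follows (the specific schedule, factor, and augmentation layers here are my assumptions, not necessarily what the notebook uses):

```python
import tensorflow as tf

# Hypothetical starting point: half of Adam's default 1e-3.
halved_lr = 5e-4

# One common way to get a "variable" learning rate: decay it
# whenever validation loss plateaus (schedule is an assumption).
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6, verbose=1)

# Simple augmentation applied to the training images only.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
], name="data_augmentation")

# Augmentation preserves shape, so it can sit in front of EfficientNetB0.
images = tf.zeros((1, 64, 64, 3))
print(data_augmentation(images, training=True).shape)  # (1, 64, 64, 3)
```

The callback would be passed via `model.fit(..., callbacks=[reduce_lr])`, and the augmentation block only activates when `training=True`, so evaluation sees unmodified images.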
NOTE - I found this on batch size:
What is Batch Size?
Batch size is one of the most important hyperparameters in deep learning training. It represents the number of samples used in one forward and backward pass through the network, and it has a direct impact on the accuracy and computational efficiency of training. Batch size can be understood as a trade-off between accuracy and speed: large batch sizes can lead to faster training times but may result in lower accuracy and overfitting, while smaller batch sizes can provide better accuracy but are more computationally expensive and time-consuming.
The batch size can also affect the convergence of the model, meaning that it can influence the optimization process and the speed at which the model learns. Small batch sizes can be more susceptible to random fluctuations in the training data, while larger batch sizes are more resistant to these fluctuations but may converge more slowly.
It is important to note that there is no one-size-fits-all answer when it comes to choosing a batch size, as the ideal size will depend on several factors, including the size of the training dataset, the complexity of the model, and the computational resources available.
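To make the trade-off concrete for Food-101 (750 training images per class x 101 classes = 75,750 images), the batch size directly fixes how many gradient updates one epoch performs:

```python
import math

TRAIN_IMAGES = 750 * 101  # Food-101 training split: 75,750 images

# Batch sizes from the timing table above.
for batch_size in (10, 16, 32, 80, 128):
    steps = math.ceil(TRAIN_IMAGES / batch_size)
    print(f"batch_size={batch_size:>3} -> {steps} steps per epoch")
```

Going from a batch size of 10 to 128 cuts the number of gradient updates per epoch from 7,575 to 592, which is part of why larger batches can both speed up epochs and change convergence behaviour.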