When you submit a command job, your code is uploaded to FloydHub and training starts in the cloud. You can close the browser or shut down your machine without affecting the job running in the cloud. You can monitor the job at any time from the website and view its logs and metrics in real time. When the training script ends, FloydHub shuts the job down automatically.
Workspaces are interactive development environments. You can run Jupyter notebooks and Python scripts in them. When running Jupyter notebooks, you have to keep your browser tab open and connected to the internet to receive the output of your code cells. If you close your browser while a code cell is executing, the code continues to run but you lose the output printed by the cell. When you re-open the browser tab, the output of that code cell will be empty.
This is an artifact of Jupyter Notebooks:
"Anything already running in the notebook will keep running, and the kernel it started for that will stay running - so it won't lose your variables. However, any output produced while the notebook isn't open in a browser tab is lost; there isn't an easy way to change this until we have the notebook server able to track the document state, which has been on the plan for ages." - Jupyter Team
Here are some workarounds that you can use:
# Logging message to a log file (e.g. <your_notebook_name>.log)
# Inside the Code Cells
mylogger('End of Notebook')
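The mylogger call above needs a definition somewhere earlier in the notebook. One possible way to set it up, as a minimal sketch using Python's standard logging module: the helper name make_notebook_logger, the logger name, and the file name my_notebook.log are all illustrative assumptions, not part of FloydHub or Jupyter.

```python
import logging


def make_notebook_logger(log_path):
    """Return a function that appends timestamped messages to log_path.

    Hypothetical helper for illustration; not a FloydHub or Jupyter API.
    """
    logger = logging.getLogger("notebook_logger")  # assumed logger name
    logger.setLevel(logging.INFO)
    # Do not forward records to the root logger, so messages go to the
    # file only and do not depend on the cell output.
    logger.propagate = False
    # Re-running the cell reuses the existing handler instead of adding
    # a second one (which would duplicate every line in the file).
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger.info


# Placeholder file name; substitute your own notebook's name.
mylogger = make_notebook_logger("my_notebook.log")

# Inside the code cells
mylogger("Epoch 1 finished")
mylogger("End of Notebook")
```

Because each message is written to the file as soon as it is logged, you can open the log file after reconnecting to see how far the notebook got, even though the cell output itself is gone.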
Send metrics to TensorBoard - this way you will be able to view the training results later even if the code cell output is lost.
Convert your notebook to a Python script and run the job in command mode. You can also run a notebook directly from the Terminal or as a CLI Job with the jupyter nbconvert command:
# Terminal
$ jupyter nbconvert --execute --to notebook --inplace --ExecutePreprocessor.timeout=-1 <notebook_to_execute>
# CLI Job
$ floyd run ... 'jupyter nbconvert --execute --to notebook --inplace --ExecutePreprocessor.timeout=-1 <notebook_to_execute>'
This command executes your notebook and, once execution finishes, replaces the output stored in the notebook with the new output. The timeout is disabled (--ExecutePreprocessor.timeout=-1) for simplicity.
For more about nbconvert, see the related docs.
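If you would rather launch the same nbconvert invocation from a Python script, for example as the entry point of a command job, a minimal sketch could look like the following. The function names build_nbconvert_command and run_notebook are hypothetical, and the sketch assumes the jupyter executable is available on the PATH of the environment where it runs.

```python
import subprocess


def build_nbconvert_command(notebook_path):
    """Build the argument list for the nbconvert command shown above."""
    return [
        "jupyter", "nbconvert",
        "--execute",                         # run every cell in order
        "--to", "notebook",                  # write the result as a notebook
        "--inplace",                         # overwrite the input file
        "--ExecutePreprocessor.timeout=-1",  # disable the per-cell timeout
        notebook_path,
    ]


def run_notebook(notebook_path):
    """Execute the notebook in place.

    Raises subprocess.CalledProcessError if nbconvert exits with an error,
    for example when a cell raises an exception.
    """
    subprocess.run(build_nbconvert_command(notebook_path), check=True)
```

Calling run_notebook with the path to your notebook then behaves like the Terminal command. Passing the arguments as a list avoids shell quoting problems when the notebook path contains spaces.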