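While training a model over a remote connection to an Ubuntu server, my SSH session froze and the training run was interrupted. I had to start over from scratch, which wasted compute.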

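To keep programs on the Ubuntu server (a model training run, for example) from being terminated when the connection drops or when you log out of SSH, I recommend running them under screen, a terminal session manager.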

Installing screen:

sudo apt-get install screen

Using screen:

ssh username@ip
# For example:
cd /path/to/python_demo
screen python demo.py
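If you would rather not keep the terminal attached while the job starts, the -dmS option listed in the help output further down starts a named session already detached. A minimal sketch, assuming a session name such as train; the SCREEN_BIN variable is a hypothetical override added here only so the command can be swapped out for testing:

```shell
#!/usr/bin/env bash
# Sketch: start a command inside a detached, named screen session.
# SCREEN_BIN is a hypothetical override (defaults to the real screen binary).

start_in_screen() {
    session="$1"
    shift
    # -dmS <name>: create the session in detached mode under the given name,
    # then run the remaining arguments as the command inside it.
    "${SCREEN_BIN:-screen}" -dmS "$session" "$@"
}

# Example (run from the project directory):
#   start_in_screen train python demo.py
# Reattach later with:
#   screen -r train
```

Because the session is named, you can reattach to it by name instead of looking up its numeric ID.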

Common screen commands:

[1] - Create a new session, for example one named test_screen:

screen -S test_screen

[2] - List all current screen sessions:

screen -ls

Sample output:

There is a screen on:
    31031.pts-8.mk-SYS-7048GR-TR    (2019年01月03日 11时00分07秒)    (Detached)
1 Socket in /var/run/screen/S-hgf.
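In scripts it can be handy to pull just the numeric session IDs out of this listing. The following is a rough sketch that assumes every session line starts with "<pid>.<name>", as in the sample output above; list_screen_ids is a hypothetical helper name, not part of screen:

```shell
#!/usr/bin/env bash
# Sketch: extract numeric session IDs from `screen -ls` output read on stdin.

list_screen_ids() {
    # Session lines have a first field like "31031.pts-8.hostname";
    # header and footer lines do not start with "<digits>.".
    awk '$1 ~ /^[0-9]+\./ { split($1, parts, "."); print parts[1] }'
}

# Example:
#   screen -ls | list_screen_ids
```

Each ID printed this way can be passed to screen -r to reattach to that session.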

[3] - Reattach to the test_screen session created above:

screen -r test_screen

[4] - Detach from the current session and leave it running in the background: press Ctrl+a, then d.

[5] - Reattach to a screen session by its ID and pick up where the program left off:

screen -r ID
# For example:
screen -r 31031

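Note: if this fails with "There is no screen to be resumed matching xxx", the session is probably still marked as attached somewhere else. Detach it first, then reattach: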

screen -d ID
screen -r ID

[6] - Terminate a screen session completely:

Once the program running inside screen has finished, reattach to the session and run exit.

screen -r test_screen
exit
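A session can also be closed without attaching to it, using the -X option from the help output to send the session screen's quit command. This is a sketch rather than the only way to do it; note that quit ends the session even if a program is still running inside. SCREEN_BIN is again a hypothetical override used only for testing:

```shell
#!/usr/bin/env bash
# Sketch: terminate a named screen session from outside.
# Caution: this kills whatever is still running in that session.

stop_screen() {
    session="$1"
    # -S <name> selects the session; -X quit sends it the "quit" command.
    "${SCREEN_BIN:-screen}" -S "$session" -X quit
}

# Example:
#   stop_screen test_screen
```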

[7] - View the help information:

screen -h

For example:

Use: screen [-opts] [cmd [args]]
 or: screen -r [host.tty]

Options:
-4            Resolve hostnames only to IPv4 addresses.
-6            Resolve hostnames only to IPv6 addresses.
-a            Force all capabilities into each window's termcap.
-A -[r|R]     Adapt all windows to the new display width & height.
-c file       Read configuration file instead of '.screenrc'.
-d (-r)       Detach the elsewhere running screen (and reattach here).
-dmS name     Start as daemon: Screen session in detached mode.
-D (-r)       Detach and logout remote (and reattach here).
-D -RR        Do whatever is needed to get a screen session.
-e xy         Change command characters.
-f            Flow control on, -fn = off, -fa = auto.
-h lines      Set the size of the scrollback history buffer.
-i            Interrupt output sooner when flow control is on.
-l            Login mode on (update /var/run/utmp), -ln = off.
-ls [match]   or
-list         Do nothing, just list our SockDir [on possible matches].
-L            Turn on output logging.
-m            ignore $STY variable, do create a new screen session.
-O            Choose optimal output rather than exact vt100 emulation.
-p window     Preselect the named window if it exists.
-q            Quiet startup. Exits with non-zero return code if unsuccessful.
-Q            Commands will send the response to the stdout of the querying process.
-r [session]  Reattach to a detached screen process.
-R            Reattach if possible, otherwise start a new session.
-s shell      Shell to execute rather than $SHELL.
-S sockname   Name this session <pid>.sockname instead of <pid>.<tty>.<host>.
-t title      Set title. (window's name).
-T term       Use term as $TERM for windows, rather than "screen".
-U            Tell screen to use UTF-8 encoding.
-v            Print "Screen version 4.03.01 (GNU) 28-Jun-15".
-wipe [match] Do nothing, just clean up SockDir [on possible matches].
-x            Attach to a not detached screen. (Multi display mode).
-X            Execute <cmd> as a screen command in the specified session.

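Troubleshooting:

[1] - A session shows as Attached and cannot be reattached

Problem: screen -ls reports the session as Attached even though nobody is actually logged into it. The session should normally show as (Detached) at this point, and screen -r session-id refuses to attach no matter how many times you try.

Solution: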

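screen -D -r <session-id>

-D -r first detaches the session from wherever it is still attached (logging that terminal out), then reattaches it here.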
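The two steps can be wrapped in a small helper that tries a plain reattach first and only forces a detach when that fails. This is a sketch that assumes screen -r exits with a non-zero status when it cannot attach; reattach and SCREEN_BIN are hypothetical names introduced for illustration:

```shell
#!/usr/bin/env bash
# Sketch: reattach to a session, forcing a detach only if needed.
# Assumes `screen -r` returns non-zero when the session cannot be resumed.

reattach() {
    id="$1"
    # Try the normal reattach; on failure, detach the stale
    # attachment with -D and reattach in the same command.
    "${SCREEN_BIN:-screen}" -r "$id" || "${SCREEN_BIN:-screen}" -D -r "$id"
}

# Example:
#   reattach 31031
```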

Last modification: March 10th, 2020 at 10:59 am