wget - "World Wide Web" 与 "get",是一个从网络上自动下载文件的自由工具,支持通过HTTP、HTTPS、FTP 三个常见的TCP/IP协议下载,并可以使用HTTP代理.
wget 可以跟踪 HTML 页面上的链接依次下载来创建远程服务器的本地版本,完全重建原始站点的目录结构. 这常称作 "递归下载".
wget 可以在下载时,同时将链接转换成指向本地文件,以方便离线浏览.
wget 非常稳定,即使在带宽很窄和网络不稳定时,适应性也很好.
如果是由于网络原因下载失败,wget 会不断的尝试,直到整个文件下载完毕.
如果是服务器打断下载过程,wget 会再次联到服务器上从停止的地方继续下载(断点下载). 对于限定了链接时间的服务器上下载大文件非常有用.
wget 功能强大,且使用比较简单.
1. Common wget usage
Usage:
wget [options] [URL]
Common options (note that options are case-sensitive); a combined usage sketch follows the list:
-a <logfile>: append log messages to the specified log file;
--append-output=FILE: append log messages to FILE (long form of -a);
-A <extensions>: download only files with the given extensions; separate multiple extensions with commas;
-b: run wget in the background;
-B <URL>: set the base URL; links read from an input file (with -i -F) are resolved relative to it;
-c: continue a previously interrupted download (resume);
-C <on/off>: turn server-side data caching on (on) or off (off); the default is on;
-d: run in debug mode;
-D <domain-list>: set the list of domains to follow, separated by commas;
-e <command>: execute the given command as if it were part of the .wgetrc file;
-h: show help;
-i <file>: read the URLs to download from the specified file;
-k: convert links in the downloaded pages so that they point to local files;
-l <depth>: set the maximum recursion depth (inf or 0 for unlimited);
-L: follow relative links only;
-m: --mirror, equivalent to -r -N -l inf --no-remove-listing;
-nc: do not overwrite an existing file when downloading;
-nd: do not recreate the directory hierarchy when downloading recursively; save all files into the current directory;
-nH: do not create host-name directories;
-np: do not ascend to the parent directory;
-nv: show only updates and errors during the download, not the full verbose output;
-N: do not re-download a file unless the remote copy is newer than the local one;
-o <logfile>: write log messages to the specified file (creating a new file);
-p: download all the resources needed to display an HTML page properly;
-P <dir>: save files under the specified download directory;
-q: quiet mode, show no output;
-r: download recursively;
-v: verbose output (the default);
-V: show version information;
--passive-ftp: use passive (PASV) mode when connecting to FTP servers;
--follow-ftp: follow FTP links found in HTML documents.
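To show how several of these options combine in practice, here is a minimal sketch; the ./downloads directory and the fetch.log file name are only illustrative, and the URL is the sample file used in the examples below:
# background download, resumable, saved under ./downloads, log written to fetch.log
wget -b -c -P ./downloads -o fetch.log http://www.linuxde.net/testfile.zip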
2. Examples
2.1 Download a single file
wget http://www.linuxde.net/testfile.zip
Downloads a file from the network and saves it in the current directory. A progress indicator is shown during the download, including the completion percentage, bytes downloaded so far, current download speed, and estimated time remaining.
2.2 Download and save under a different file name
wget -O wordpress.zip http://www.linuxde.net/download.aspx?id=1080
By default, wget names the saved file after whatever follows the last '/' in the URL, so downloads from dynamic links usually end up with the wrong file name.
Wrong: the following command downloads a file but saves it under the name download.aspx?id=1080:
wget http://www.linuxde.net/download.aspx?id=1080
Even though the downloaded file is a zip archive, it is still named download.aspx?id=1080.
Correct: to solve this, use the -O option to specify a file name:
wget -O wordpress.zip http://www.linuxde.net/download.aspx?id=1080
2.3 Limit the download speed
wget --limit-rate=300k http://www.linuxde.net/testfile.zip
By default, wget uses all the bandwidth it can get.
When you are about to download a large file but still need to download other things, limiting the rate becomes necessary.
2.4 Resume an interrupted download
wget -c http://www.linuxde.net/testfile.zip
Use wget -c to restart an interrupted download, which is especially helpful for large files.
When the download is suddenly cut off, for example by a network failure, you can continue where it stopped instead of downloading the whole file again.
2.5 Download in the background
wget -b http://www.linuxde.net/testfile.zip
Continuing in background, pid 1840.
Output will be written to `wget-log'.
When downloading a very large file, use the -b option to run the download in the background. You can then watch the progress with:
tail -f wget-log
2.6 Download with a spoofed user agent
wget --user-agent="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16" \
http://www.linuxde.net/testfile.zip
Some sites reject download requests when they detect that the user agent is not a browser.
You can work around this by spoofing the agent string with the --user-agent option.
2.7 Test a download link
If you plan to schedule a download, you should first test whether the download link is valid.
Add the --spider option to perform the check:
wget --spider URL
If the link is valid, the output will look like this:
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.
This ensures the download can run at the scheduled time. If the link is broken, you will instead see an error like this:
wget --spider url
Spider mode enabled. Check if remote file exists.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
The --spider option is useful in situations such as (a sketch of the first case follows this list):
- checking a link before a scheduled download
- periodically checking whether a site is available
- finding dead links on a website's pages
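A minimal pre-check sketch for the scheduled-download case might look like the following; the URL is the sample file from the earlier examples and the script itself is only an illustration:
# probe the link quietly; download only if the remote file exists
URL=http://www.linuxde.net/testfile.zip
if wget -q --spider "$URL"; then
    wget -c "$URL"
else
    echo "link is broken, skipping download"
fi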
2.8 Increase the number of retries
wget --tries=40 URL
A download can fail when the network is unreliable or the file is very large.
By default, wget retries the connection 20 times to download a file.
If needed, use --tries to increase the number of retries.
2.9 Download multiple files
wget -i filelist.txt
First, save the download URLs to a file:
cat > filelist.txt
url1
url2
url3
url4
Then download with this file and the -i option (end the cat input with Ctrl-D). A non-interactive way to build the list is sketched below.
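If the list is created inside a script, a here-document avoids the interactive input; url1 and url2 are placeholders just as in the example above:
# write the URL list without interactive input, then fetch every entry
cat > filelist.txt <<'EOF'
url1
url2
EOF
wget -i filelist.txt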
2.10 Mirror a website
wget --mirror -p --convert-links -P ./LOCAL URL
Downloads an entire website to the local machine.
--mirror turns on mirroring.
-p downloads every file needed for the HTML pages to display correctly.
--convert-links rewrites the links into local ones after downloading.
-P ./LOCAL saves all files and directories under the specified local directory.
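Since --mirror is a shortcut for -r -N -l inf --no-remove-listing (see the help output in section 3), the command above is roughly equivalent to:
wget -r -N -l inf --no-remove-listing -p --convert-links -P ./LOCAL URL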
2.11 Exclude a file type from the download
wget --reject=gif url
Use this command to download a website while skipping images (GIF files in this case).
2.12 Save download messages to a log file
wget -o download.log URL
Use this when you want the download messages written to a log file instead of shown in the terminal.
2.13 Limit the total download size
wget -Q5m -i filelist.txt
Stops downloading once the retrieved files exceed 5 MB in total.
Note: this quota has no effect when downloading a single file; it only applies to recursive downloads or downloads from a URL list.
2.14 Download only files of a given type
wget -r -A.pdf url
This is useful, for example, to (a sketch follows this list):
- download all images from a website;
- download all videos from a website;
- download all PDF files from a website.
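For the image case, the accepted-extension list can name several formats at once; the extension list and URL here are only illustrative:
# recursively fetch common image formats only, without ascending to the parent directory
wget -r -np -A jpg,jpeg,png,gif http://www.linuxde.net/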
2.15 FTP download
wget ftp-url
wget --ftp-user=USERNAME --ftp-password=PASSWORD url
wget can also download from FTP links.
Anonymous FTP download:
wget ftp-url
FTP download with username and password authentication:
wget --ftp-user=USERNAME --ftp-password=PASSWORD url
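Credentials can also be embedded directly in the FTP URL, though they will then be visible in the shell history and process list; the host and path below are placeholders:
wget ftp://USERNAME:PASSWORD@ftp.example.com/pub/file.tar.gz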
3. Full option list (wget -h)
wget -h # print the help text listing all available options
Startup:
-V, --version display the version of Wget and exit
-h, --help print this help
-b, --background go to background after startup
-e, --execute=COMMAND execute a `.wgetrc'-style command
Logging and input file:
-o, --output-file=FILE log messages to FILE
-a, --append-output=FILE append messages to FILE
-d, --debug print lots of debugging information
-q, --quiet quiet (no output)
-v, --verbose be verbose (this is the default)
-nv, --no-verbose turn off verboseness, without being quiet
--report-speed=TYPE output bandwidth as TYPE. TYPE can be bits
-i, --input-file=FILE download URLs found in local or external FILE
-F, --force-html treat input file as HTML
-B, --base=URL resolves HTML input-file links (-i -F)
relative to URL
--config=FILE specify config file to use
--no-config do not read any config file
--rejected-log=FILE log reasons for URL rejection to FILE
Download:
-t, --tries=NUMBER set number of retries to NUMBER (0 unlimits)
--retry-connrefused retry even if connection is refused
-O, --output-document=FILE write documents to FILE
-nc, --no-clobber skip downloads that would download to
existing files (overwriting them)
-c, --continue resume getting a partially-downloaded file
--start-pos=OFFSET start downloading from zero-based position OFFSET
--progress=TYPE select progress gauge type
--show-progress display the progress bar in any verbosity mode
-N, --timestamping don't re-retrieve files unless newer than
local
--no-if-modified-since don't use conditional if-modified-since get
requests in timestamping mode
--no-use-server-timestamps don't set the local file's timestamp by
the one on the server
-S, --server-response print server response
--spider don't download anything
-T, --timeout=SECONDS set all timeout values to SECONDS
--dns-timeout=SECS set the DNS lookup timeout to SECS
--connect-timeout=SECS set the connect timeout to SECS
--read-timeout=SECS set the read timeout to SECS
-w, --wait=SECONDS wait SECONDS between retrievals
--waitretry=SECONDS wait 1..SECONDS between retries of a retrieval
--random-wait wait from 0.5*WAIT...1.5*WAIT secs between retrievals
--no-proxy explicitly turn off proxy
-Q, --quota=NUMBER set retrieval quota to NUMBER
--bind-address=ADDRESS bind to ADDRESS (hostname or IP) on local host
--limit-rate=RATE limit download rate to RATE
--no-dns-cache disable caching DNS lookups
--restrict-file-names=OS restrict chars in file names to ones OS allows
--ignore-case ignore case when matching files/directories
-4, --inet4-only connect only to IPv4 addresses
-6, --inet6-only connect only to IPv6 addresses
--prefer-family=FAMILY connect first to addresses of specified family,
one of IPv6, IPv4, or none
--user=USER set both ftp and http user to USER
--password=PASS set both ftp and http password to PASS
--ask-password prompt for passwords
--no-iri turn off IRI support
--local-encoding=ENC use ENC as the local encoding for IRIs
--remote-encoding=ENC use ENC as the default remote encoding
--unlink remove file before clobber
Directories:
-nd, --no-directories don't create directories
-x, --force-directories force creation of directories
-nH, --no-host-directories don't create host directories
--protocol-directories use protocol name in directories
-P, --directory-prefix=PREFIX save files to PREFIX/..
--cut-dirs=NUMBER ignore NUMBER remote directory components
HTTP options:
--http-user=USER set http user to USER
--http-password=PASS set http password to PASS
--no-cache disallow server-cached data
--default-page=NAME change the default page name (normally
this is 'index.html'.)
-E, --adjust-extension save HTML/CSS documents with proper extensions
--ignore-length ignore 'Content-Length' header field
--header=STRING insert STRING among the headers
--max-redirect maximum redirections allowed per page
--proxy-user=USER set USER as proxy username
--proxy-password=PASS set PASS as proxy password
--referer=URL include 'Referer: URL' header in HTTP request
--save-headers save the HTTP headers to file
-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION
--no-http-keep-alive disable HTTP keep-alive (persistent connections)
--no-cookies don't use cookies
--load-cookies=FILE load cookies from FILE before session
--save-cookies=FILE save cookies to FILE after session
--keep-session-cookies load and save session (non-permanent) cookies
--post-data=STRING use the POST method; send STRING as the data
--post-file=FILE use the POST method; send contents of FILE
--method=HTTPMethod use method "HTTPMethod" in the request
--body-data=STRING send STRING as data. --method MUST be set
--body-file=FILE send contents of FILE. --method MUST be set
--content-disposition honor the Content-Disposition header when
choosing local file names (EXPERIMENTAL)
--content-on-error output the received content on server errors
--auth-no-challenge send Basic HTTP authentication information
without first waiting for the server's
challenge
HTTPS (SSL/TLS) options:
--secure-protocol=PR choose secure protocol, one of auto, SSLv2,
SSLv3, TLSv1 and PFS
--https-only only follow secure HTTPS links
--no-check-certificate don't validate the server's certificate
--certificate=FILE client certificate file
--certificate-type=TYPE client certificate type, PEM or DER
--private-key=FILE private key file
--private-key-type=TYPE private key type, PEM or DER
--ca-certificate=FILE file with the bundle of CAs
--ca-directory=DIR directory where hash list of CAs is stored
--crl-file=FILE file with bundle of CRLs
--random-file=FILE file with random data for seeding the SSL PRNG
--egd-file=FILE file naming the EGD socket with random data
HSTS options:
--no-hsts disable HSTS
--hsts-file path of HSTS database (will override default)
FTP options:
--ftp-user=USER set ftp user to USER
--ftp-password=PASS set ftp password to PASS
--no-remove-listing don't remove '.listing' files
--no-glob turn off FTP file name globbing
--no-passive-ftp disable the "passive" transfer mode
--preserve-permissions preserve remote file permissions
--retr-symlinks when recursing, get linked-to files (not dir)
FTPS options:
--ftps-implicit use implicit FTPS (default port is 990)
--ftps-resume-ssl resume the SSL/TLS session started in the control connection when
opening a data connection
--ftps-clear-data-connection cipher the control channel only; all the data will be in plaintext
--ftps-fallback-to-ftp fall back to FTP if FTPS is not supported in the target server
WARC options:
--warc-file=FILENAME save request/response data to a .warc.gz file
--warc-header=STRING insert STRING into the warcinfo record
--warc-max-size=NUMBER set maximum size of WARC files to NUMBER
--warc-cdx write CDX index files
--warc-dedup=FILENAME do not store records listed in this CDX file
--no-warc-compression do not compress WARC files with GZIP
--no-warc-digests do not calculate SHA1 digests
--no-warc-keep-log do not store the log file in a WARC record
--warc-tempdir=DIRECTORY location for temporary files created by the
WARC writer
Recursive download:
-r, --recursive specify recursive download
-l, --level=NUMBER maximum recursion depth (inf or 0 for infinite)
--delete-after delete files locally after downloading them
-k, --convert-links make links in downloaded HTML or CSS point to
local files
--convert-file-only convert the file part of the URLs only (usually known as the basename)
--backups=N before writing file X, rotate up to N backup files
-K, --backup-converted before converting file X, back up as X.orig
-m, --mirror shortcut for -N -r -l inf --no-remove-listing
-p, --page-requisites get all images, etc. needed to display HTML page
--strict-comments turn on strict (SGML) handling of HTML comments
Recursive accept/reject:
-A, --accept=LIST comma-separated list of accepted extensions
-R, --reject=LIST comma-separated list of rejected extensions
--accept-regex=REGEX regex matching accepted URLs
--reject-regex=REGEX regex matching rejected URLs
--regex-type=TYPE regex type (posix|pcre)
-D, --domains=LIST comma-separated list of accepted domains
--exclude-domains=LIST comma-separated list of rejected domains
--follow-ftp follow FTP links from HTML documents
--follow-tags=LIST comma-separated list of followed HTML tags
--ignore-tags=LIST comma-separated list of ignored HTML tags
-H, --span-hosts go to foreign hosts when recursive
-L, --relative follow relative links only
-I, --include-directories=LIST list of allowed directories
--trust-server-names use the name specified by the redirection
URL's last component
-X, --exclude-directories=LIST list of excluded directories
-np, --no-parent don't ascend to the parent directory