Amazon S3 is cloud storage provided by Amazon Web Services (AWS). Amazon S3 exposes a set of web service interfaces, on top of which many third-party commercial services and client applications have been built. This tutorial describes how to access Amazon S3 cloud storage from the Linux command line.
The best-known Amazon S3 command-line client is s3cmd, written in Python. As a simple AWS S3 command-line tool, s3cmd is designed for scripted cron jobs, such as daily backups.
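For example, a daily backup can be a single crontab entry. A minimal sketch, assuming a bucket named s3://mybackups and an already-configured /root/.s3cfg (both names are placeholders; configuration is covered in section 2 below):

# /etc/cron.d/s3-backup -- sync /var/www to S3 every night at 02:30
30 2 * * * root /usr/bin/s3cmd -c /root/.s3cfg sync /var/www/ s3://mybackups/www/ >> /var/log/s3-backup.log 2>&1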
Using s3cmd
1. Installation
# Install s3cmd on Ubuntu or Debian
sudo apt-get install s3cmd
# On CentOS
yum install s3cmd
# On Gentoo
emerge -av s3cmd
# Install from an RPM package
rpm -ivh http://s3tools.org/repo/RHEL_6/x86_64/s3cmd-1.0.0-4.1.x86_64.rpm
# Install from source
git clone https://github.com/s3tools/s3cmd
cd s3cmd
python setup.py install
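Whichever method you use, a quick sanity check confirms the installation:

# Should print the installed s3cmd version
s3cmd --version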
Command Reference
Usage: s3cmd [options] COMMAND [parameters]

S3cmd is a tool for managing objects in Amazon S3 storage. It allows for
making and removing "buckets" and uploading, downloading and removing
"objects" from these buckets.

Options:
  -h, --help            show this help message and exit
  --configure           Invoke interactive (re)configuration tool. Optionally use as '--configure s3://some-bucket' to test access to a specific bucket instead of attempting to list them all.
  -c FILE, --config=FILE  Config file name. Defaults to /home/mludvig/.s3cfg
  --dump-config         Dump current configuration after parsing config files and command line options and exit.
  --access_key=ACCESS_KEY  AWS Access Key
  --secret_key=SECRET_KEY  AWS Secret Key
  -n, --dry-run         Only show what should be uploaded or downloaded but don't actually do it. May still perform S3 requests to get bucket listings and other information though (only for file transfer commands)
  -e, --encrypt         Encrypt files before uploading to S3.
  --no-encrypt          Don't encrypt files.
  -f, --force           Force overwrite and other dangerous operations.
  --continue            Continue getting a partially downloaded file (only for [get] command).
  --continue-put        Continue uploading partially uploaded files or multipart upload parts. Restarts files/parts that don't have matching size and md5. Skips files/parts that do. Note: md5sum checks are not always sufficient to check (part) file equality. Enable this at your own risk.
  --upload-id=UPLOAD_ID  UploadId for Multipart Upload, in case you want to continue an existing upload (equivalent to --continue-put) and there are multiple partial uploads. Use s3cmd multipart [URI] to see what UploadIds are associated with the given URI.
  --skip-existing       Skip over files that exist at the destination (only for [get] and [sync] commands).
  -r, --recursive       Recursive upload, download or removal.
  --check-md5           Check MD5 sums when comparing files for [sync]. (default)
  --no-check-md5        Do not check MD5 sums when comparing files for [sync]. Only size will be compared. May significantly speed up transfer but may also miss some changed files.
  -P, --acl-public      Store objects with ACL allowing read for anyone.
  --acl-private         Store objects with default ACL allowing access for you only.
  --acl-grant=PERMISSION:EMAIL or USER_CANONICAL_ID  Grant stated permission to a given amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
  --acl-revoke=PERMISSION:USER_CANONICAL_ID  Revoke stated permission for a given amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
  -D NUM, --restore-days=NUM  Number of days to keep restored file available (only for 'restore' command).
  --delete-removed      Delete remote objects with no corresponding local file [sync]
  --no-delete-removed   Don't delete remote objects.
  --delete-after        Perform deletes after new uploads [sync]
  --delay-updates       Put all updated files into place at end [sync]
  --max-delete=NUM      Do not delete more than NUM files. [del] and [sync]
  --add-destination=ADDITIONAL_DESTINATIONS  Additional destination for parallel uploads, in addition to last arg. May be repeated.
  --delete-after-fetch  Delete remote objects after fetching to local file (only for [get] and [sync] commands).
  -p, --preserve        Preserve filesystem attributes (mode, ownership, timestamps). Default for [sync] command.
  --no-preserve         Don't store FS attributes
  --exclude=GLOB        Filenames and paths matching GLOB will be excluded from sync
  --exclude-from=FILE   Read --exclude GLOBs from FILE
  --rexclude=REGEXP     Filenames and paths matching REGEXP (regular expression) will be excluded from sync
  --rexclude-from=FILE  Read --rexclude REGEXPs from FILE
  --include=GLOB        Filenames and paths matching GLOB will be included even if previously excluded by one of --(r)exclude(-from) patterns
  --include-from=FILE   Read --include GLOBs from FILE
  --rinclude=REGEXP     Same as --include but uses REGEXP (regular expression) instead of GLOB
  --rinclude-from=FILE  Read --rinclude REGEXPs from FILE
  --ignore-failed-copy  Don't exit unsuccessfully because of missing keys
  --files-from=FILE     Read list of source-file names from FILE. Use - to read from stdin.
  --bucket-location=BUCKET_LOCATION  Datacentre to create bucket in. As of now the datacenters are: US (default), EU, ap-northeast-1, ap-southeast-1, sa-east-1, us-west-1 and us-west-2
  --reduced-redundancy, --rr  Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]
  --access-logging-target-prefix=LOG_TARGET_PREFIX  Target prefix for access logs (S3 URI) (for [cfmodify] and [accesslog] commands)
  --no-access-logging   Disable access logging (for [cfmodify] and [accesslog] commands)
  --default-mime-type=DEFAULT_MIME_TYPE  Default MIME-type for stored objects. Application default is binary/octet-stream.
  -M, --guess-mime-type  Guess MIME-type of files by their extension or mime magic. Fall back to default MIME-Type as specified by --default-mime-type option
  --no-guess-mime-type  Don't guess MIME-type and use the default type instead.
  --no-mime-magic       Don't use mime magic when guessing MIME-type.
  -m MIME/TYPE, --mime-type=MIME/TYPE  Force MIME-type. Override both --default-mime-type and --guess-mime-type.
  --add-header=NAME:VALUE  Add a given HTTP header to the upload request. Can be used multiple times. For instance set 'Expires' or 'Cache-Control' headers (or both) using this option.
  --server-side-encryption  Specifies that server-side encryption will be used when putting objects.
  --encoding=ENCODING   Override autodetected terminal and filesystem encoding (character set). Autodetected: UTF-8
  --add-encoding-exts=EXTENSIONs  Add encoding to these comma delimited extensions i.e. (css,js,html) when uploading to S3
  --verbatim            Use the S3 name as given on the command line. No pre-processing, encoding, etc. Use with caution!
  --disable-multipart   Disable multipart upload on files bigger than --multipart-chunk-size-mb
  --multipart-chunk-size-mb=SIZE  Size of each chunk of a multipart upload. Files bigger than SIZE are automatically uploaded as multithreaded-multipart, smaller files are uploaded using the traditional method. SIZE is in Mega-Bytes, default chunk size is 15MB, minimum allowed chunk size is 5MB, maximum is 5GB.
  --list-md5            Include MD5 sums in bucket listings (only for 'ls' command).
  -H, --human-readable-sizes  Print sizes in human readable form (eg 1kB instead of 1234).
  --ws-index=WEBSITE_INDEX  Name of index-document (only for [ws-create] command)
  --ws-error=WEBSITE_ERROR  Name of error-document (only for [ws-create] command)
  --progress            Display progress meter (default on TTY).
  --no-progress         Don't display progress meter (default on non-TTY).
  --enable              Enable given CloudFront distribution (only for [cfmodify] command)
  --disable             Disable given CloudFront distribution (only for [cfmodify] command)
  --cf-invalidate       Invalidate the uploaded files in CloudFront. Also see [cfinval] command.
  --cf-invalidate-default-index  When using Custom Origin and S3 static website, invalidate the default index file.
  --cf-no-invalidate-default-index-root  When using Custom Origin and S3 static website, don't invalidate the path to the default index file.
  --cf-add-cname=CNAME  Add given CNAME to a CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
  --cf-remove-cname=CNAME  Remove given CNAME from a CloudFront distribution (only for [cfmodify] command)
  --cf-comment=COMMENT  Set COMMENT for a given CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
  --cf-default-root-object=DEFAULT_ROOT_OBJECT  Set the default root object to return when no object is specified in the URL. Use a relative path, i.e. default/index.html instead of /default/index.html or s3://bucket/default/index.html (only for [cfcreate] and [cfmodify] commands)
  -v, --verbose         Enable verbose output.
  -d, --debug           Enable debug output.
  --version             Show s3cmd version (1.5.0-beta1) and exit.
  -F, --follow-symlinks  Follow symbolic links as if they are regular files
  --cache-file=FILE     Cache FILE containing local source MD5 values
  -q, --quiet           Silence output on stdout
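Many of these options combine naturally on a single put. A sketch using only flags documented above (the bucket path and file name are made up for illustration):

# Publicly readable upload with reduced-redundancy storage and a one-day cache header
s3cmd put --acl-public --guess-mime-type --reduced-redundancy \
    --add-header='Cache-Control: max-age=86400' \
    logo.png s3://icyboy/site/logo.png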
2. Initialization
The first time you run s3cmd, configure it with the following command:
s3cmd --configure
It will ask you a series of questions:
- Your AWS S3 access key and secret key
- An encryption password for data transferred to and from AWS S3
- The path to the GPG program used to encrypt data (e.g., /usr/bin/gpg)
- Whether to use the HTTPS protocol
- The host name and port, if an HTTP proxy is used
The configuration is saved in plain text in ~/.s3cfg.
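The exact keys vary between s3cmd versions, but the file looks roughly like the sketch below (all values are placeholders, not real credentials):

[default]
access_key = AKIAIOSFODNN7EXAMPLE
secret_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
use_https = True
gpg_command = /usr/bin/gpg
gpg_passphrase = your-encryption-password
proxy_host =
proxy_port = 0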
Since the file contains your credentials, restrict its permissions:

chmod 600 ~/.s3cfg

3. Basic Usage
#0. List all objects in all buckets
s3cmd la
#1. List all existing buckets in your account
s3cmd ls
#2. Create a new bucket named icyboy
s3cmd mb s3://icyboy
#3. Upload files to an existing bucket
s3cmd put 1.png 2.png 3.png s3://icyboy
# Uploaded files are private by default: only you can access them, using valid access and secret keys.
#4. Upload a file to an existing bucket with public access (see the setacl example after this block for changing the ACL of an already-uploaded object)
s3cmd put --acl-public 4.png s3://icyboy
# A file uploaded with public access can be viewed by anyone in a browser at http://icyboy.s3.amazonaws.com/4.png
#5. List the contents of an existing bucket
s3cmd ls s3://icyboy
#6. Download files from an existing bucket (e.g., all .png files)
s3cmd get s3://icyboy/*.png
#7. Delete files from an existing bucket
s3cmd del s3://icyboy/*.png
#8. Get information about an existing bucket, including its location and access control list (ACL)
s3cmd info s3://icyboy
#9. Encrypt a file before uploading it to an existing bucket
s3cmd -e put encrypt.png s3://icyboy
# When downloading an encrypted file, s3cmd detects the encryption and decrypts it during the download, so encrypted files are fetched and accessed exactly like ordinary ones
s3cmd get s3://icyboy/encrypt.png
#10. Delete an existing bucket
s3cmd rb s3://icyboy
# Note: you cannot delete a non-empty bucket.
#11. Show the total size of a bucket
s3cmd du s3://icyboy
#12. Copy
s3cmd cp s3://icyboy/1.txt s3://xupeng/1.txt_copy
#13. Move
s3cmd mv s3://icyboy/1.txt s3://xupeng/1.txt_copy
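Related to item #4 above: an object that was uploaded as private does not need to be re-uploaded to change its visibility; s3cmd's setacl command switches the ACL in place. A sketch against this tutorial's icyboy bucket:

# Make an existing object publicly readable, then private again
s3cmd setacl --acl-public s3://icyboy/4.png
s3cmd setacl --acl-private s3://icyboy/4.png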
4. Advanced Usage

#1. Upload a directory
# -r is short for --recursive; without a trailing slash the directory name itself is included in the remote path
xupeng@icyboy ~ $ s3cmd put -r dir1 s3://icyboy/some/path/
dir1/file1-1.txt -> s3://icyboy/some/path/dir1/file1-1.txt  [1 of 2]
dir1/file1-2.txt -> s3://icyboy/some/path/dir1/file1-2.txt  [2 of 2]
# With a trailing slash on the source, only the directory's contents are uploaded, without the dir1/ prefix
xupeng@icyboy ~ $ s3cmd put -r dir1/ s3://icyboy/some/path/
dir1/file1-1.txt -> s3://icyboy/some/path/file1-1.txt  [1 of 2]
dir1/file1-2.txt -> s3://icyboy/some/path/file1-2.txt  [2 of 2]
#2. Sync
xupeng@icyboy ~ $ s3cmd sync ./ s3://icyboy/some/path/
dir2/file2-1.log -> s3://icyboy/some/path/dir2/file2-1.log  [1 of 2]
dir2/file2-2.txt -> s3://icyboy/some/path/dir2/file2-2.txt  [2 of 2]
xupeng@icyboy ~ $ s3cmd sync --dry-run --delete-removed ~/demo/ s3://icyboy/some/path/
delete: s3://icyboy/some/path/file1-1.txt
delete: s3://icyboy/some/path/file1-2.txt
upload: ~/demo/dir1/file1-2.txt -> s3://icyboy/some/path/dir1/file1-2.txt
WARNING: Exiting now because of --dry-run
xupeng@icyboy ~ $ s3cmd sync --dry-run --skip-existing --delete-removed ~/demo/ s3://icyboy/some/path/
delete: s3://icyboy/some/path/file1-1.txt
delete: s3://icyboy/some/path/file1-2.txt
WARNING: Exiting now because of --dry-run
xupeng@icyboy ~ $ s3cmd sync --dry-run --exclude '*.txt' --include 'dir2/*' . s3://icyboy/demo/
exclude: dir1/file1-1.txt
exclude: dir1/file1-2.txt
exclude: file0-2.txt
upload: ./dir2/file2-1.log -> s3://icyboy/demo/dir2/file2-1.log
upload: ./dir2/file2-2.txt -> s3://icyboy/demo/dir2/file2-2.txt
upload: ./file0-1.msg -> s3://icyboy/demo/file0-1.msg
upload: ./file0-3.log -> s3://icyboy/demo/file0-3.log
WARNING: Exiting now because of --dry-run
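The --exclude/--include patterns shown above can also live in a file passed via --exclude-from (documented in the option list), which keeps scripted sync jobs readable. A minimal sketch; the file name backup.excludes is made up for illustration:

# backup.excludes -- one GLOB pattern per line
*.tmp
*.log
.cache/*

# Then reference it from the sync command:
s3cmd sync --exclude-from backup.excludes ~/demo/ s3://icyboy/some/path/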