Application-Level Protocols

应用层协议

A client and a server exchange messages consisting of message types and message data. This requires design of a suitable message exchange protocol. This chapter looks at some of the issues involved in this, and gives a complete example of a simple client-server application.

客户端和服务器的交互包括消息类型和消息数据,这就需要有适当的交互协议。本章着重讨论客户端和服务器交互相关的问题,并给出一个完整又简单的客户端服务器交互的例子。

Introduction

介绍

A client and server need to exchange information via messages. TCP and UDP provide the transport mechanisms to do this. The two processes also need to have a protocol in place so that message exchange can take place meaningfully. A protocol defines what type of conversation can take place between two components of a distributed application, by specifying messages, data types, encoding formats and so on.

客户端和服务器需要通过消息来进行交互。TCP和UDP是信息交互的两种传输机制。在这两种传输机制之上就需要有协议来约定传输内容的含义。协议清楚说明分布式应用的两个模块之间交互消息的消息体、消息的数据类型、编码格式等。

Protocol Design

协议设计

There are many possibilities and issues to be decided on when designing a protocol. Some of the issues include:

当设计协议的时候,有许多许多的情况和问题需要考虑,比如:

Version control

版本控制

A protocol used in a client/server system will evolve over time, changing as the system expands. This raises compatibility problems: a version 2 client will make requests that a version 1 server doesn't understand, whereas a version 2 server will send replies that a version 1 client won't understand.

随着时间变化和系统的升级,客户端/服务器之间的协议也会升级。这可能会引起兼容性的问题:版本2的客户端发出的请求可能版本1的服务器无法解析,反之也一样,版本2的服务器回复的消息版本1的客户端无法解析。

Each side should ideally be able to understand messages for its own version and all earlier ones. It should be able to write replies to old style queries in old style response format.

理想情况下,不论是哪一端,都应该既能满足自己当前版本的消息规范,也能满足早期版本的消息规范。任意一端对于旧版本的请求应该返回旧版本的响应。

The ability to talk earlier version formats may be lost if the protocol changes too much. In this case, you need to be able to ensure that no copies of the earlier version still exist - and that is generally impossible.

但是如果协议变化太大的话,可能就很难保持与早期版本的兼容了。在这种情况下,你就需要保证已经不存在早期的版本了 -- 当然这个几乎是不可能的。

Part of the protocol setup should involve version information.

所以,协议应该包含有版本消息。
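
One way to carry version information is for each side to announce the highest version it speaks when the connection opens, and for both to settle on the lower of the two. The following is only a minimal sketch of that idea: the HELLO message and the parseHello and negotiate helpers are hypothetical, and it assumes each side understands every version up to its own.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// maxVersion is the highest protocol version this (hypothetical) program speaks.
const maxVersion = 2

// parseHello extracts the version number from a line such as "HELLO 2".
func parseHello(line string) (int, error) {
	fields := strings.Fields(line)
	if len(fields) != 2 || fields[0] != "HELLO" {
		return 0, errors.New("malformed HELLO message")
	}
	v, err := strconv.Atoi(fields[1])
	if err != nil || v < 1 {
		return 0, errors.New("bad version number")
	}
	return v, nil
}

// negotiate picks the highest version both sides understand,
// assuming each side also understands every earlier version.
func negotiate(ours, theirs int) int {
	if theirs < ours {
		return theirs
	}
	return ours
}

func main() {
	theirs, err := parseHello("HELLO 1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("speaking version", negotiate(maxVersion, theirs))
}
```

If a later version drops support for an earlier one, negotiate would instead have to reject the connection when the peer's version is too old.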

The Web

Web协议

The Web is a good example of a system that is complicated by the coexistence of different versions. The protocol has been through three versions, and most servers and browsers now use the latest one. The version is given in each request:

Web协议就是一个由于有不同协议版本同时存在而出现混乱的例子。Web协议已经有三个版本了,通常服务器和浏览器都是使用最新的版本,版本消息包含在请求中:

request              version
GET /                pre 1.0
GET / HTTP/1.0       HTTP 1.0
GET / HTTP/1.1       HTTP 1.1

But the content of the messages has been through a large number of versions:

但是消息体的内容已经被大量版本制定修改过:

Message Format

消息格式

In the last chapter we discussed some possibilities for representing data to be sent across the wire. Now we look one level up, to the messages which may contain such data.

上一章我们讨论了数据传输的几种可能的表现形式。现在我们进一步研究包含数据的消息。

Commonly, the first part of the message will be a message type.

通常来说,消息的头部必须包含消息类型。

The message types can be strings or integers. For example, HTTP uses integers such as 404 to mean "not found" (although these integers are written as strings). The messages from client to server and from server to client are disjoint: "LOGIN" from client to server is a different message from "LOGIN" from server to client.

消息类型应该设置为字符型或者整型。比如,HTTP使用整数404来表示“未找到资源”(尽管这个整型是被当做字符串使用)。客户端到服务器的消息和服务器到客户端的消息是不一样的:比如从客户端到服务器的“LOGIN”消息就不同于服务器到客户端的“LOGIN”消息。

Data Format

数据格式

There are two main format choices for messages: byte encoded or character encoded.

对于消息来说,有两种主要的数据格式可供选择:字节编码和字符编码。

Byte format

字节编码

In the byte format

对于字节编码

The advantages are compactness and hence speed. The disadvantages are caused by the opaqueness of the data: errors may be harder to spot, debugging is harder, and special purpose decoding functions are required. There are many examples of byte-encoded formats, including major protocols such as DNS and NFS, up to recent ones such as Skype. Of course, if your protocol is not publicly specified, then a byte format can also make it harder for others to reverse-engineer it!

字节编码的优势就是紧凑小巧,传输速度快。劣势就是数据的不透明性:字节编码很难定位错误,也很难调试。往往是要求写一些额外的解码函数。有许多字节编码格式的例子,大部分协议都是使用字节编码,例如DNS和NFS协议,还有最近出现的Skype协议。当然,如果你的协议没有公开说明结构,使用字节编码可以让其他人使用反向工程手段很难破解!

Pseudocode for a byte-format server is

字节编码的服务器的伪代码如下


    handleClient(conn) {
        while (true) {
            byte b = conn.readByte()
            switch (b) {
                case MSG_1: ...
                case MSG_2: ...
                ...
            }
        }
    }

Go has basic support for managing byte streams. The interface Conn has methods

Go提供了基本的管理字节流的方法。 接口Conn 包含有方法


(c Conn) Read(b []byte) (n int, err error)
(c Conn) Write(b []byte) (n int, err error)
    

and these methods are implemented by TCPConn and UDPConn.

这两个方法的具体实现类有 TCPConn and UDPConn
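
To see Read and Write in action without opening a real socket, the sketch below uses net.Pipe, which returns the two ends of an in-memory connection that both satisfy net.Conn. The exchange helper is a name made up for this example.

```go
package main

import (
	"fmt"
	"net"
)

// exchange writes msg on one end of an in-memory connection and
// reads it back from the other end.
func exchange(msg []byte) ([]byte, error) {
	client, server := net.Pipe()
	defer client.Close()
	defer server.Close()

	// net.Pipe is unbuffered, so the write must run concurrently
	// with the read on the other end.
	go client.Write(msg)

	buf := make([]byte, 512)
	n, err := server.Read(buf)
	if err != nil {
		return nil, err
	}
	return buf[:n], nil
}

func main() {
	reply, err := exchange([]byte("PWD"))
	fmt.Printf("%q %v\n", reply, err)
}
```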

Character Format

字符编码

In this mode, everything is sent as characters if possible. For example, an integer 234 would be sent as, say, the three characters '2', '3' and '4' instead of the one byte 234. Data that is inherently binary may be base64 encoded to change it into a 7-bit format and then sent as ASCII characters, as discussed in the previous chapter.

在这个编码模式下,所有消息都尽可能以字符的形式发送。例如,整型数字234会被处理成三个字符‘2’,‘3’,‘4’,而不会被处理成234的字节码。二进制数据将会使用base64编码变成为7-bit的格式,然后当做ASCII码传递,就和我们上一章讨论的一样。
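
The difference between the two formats for the integer 234, and the base64 treatment of binary data, can be seen in a few lines of Go. This is an illustrative sketch; the helper names encodeInt, encodeBinary and decodeBinary are assumptions, while strconv and encoding/base64 are standard library packages.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

// encodeInt renders an integer in character format, e.g. 234 becomes "234".
func encodeInt(n int) string {
	return strconv.Itoa(n)
}

// encodeBinary turns arbitrary bytes into ASCII text using base64.
func encodeBinary(data []byte) string {
	return base64.StdEncoding.EncodeToString(data)
}

// decodeBinary reverses encodeBinary.
func decodeBinary(text string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(text)
}

func main() {
	fmt.Println(len([]byte{234}), "byte in byte format")
	fmt.Println(len(encodeInt(234)), "bytes in character format")

	text := encodeBinary([]byte{0x00, 0xff, 0x10})
	back, err := decodeBinary(text)
	fmt.Println(text, back, err)
}
```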

In character format,

对于字符编码,

Pseudocode is

伪代码如下


handleClient() {
    line = conn.readLine()
    if (line.startsWith(...)) {
        ...
    } else if (line.startsWith(...)) {
        ...
    }
}

Character formats are easier to set up and easier to debug. For example, you can use telnet to connect to a server on any port, and send client requests to that server. It isn't so easy the other way, but you can use tools like tcpdump to snoop on TCP traffic and see immediately what clients are sending to servers.

字符编码很容易进行组装,也很容易调试。例如,你可以telnet连接到一台服务器的端口上,然后发送客户的请求到服务器。其他的编码方式无法轻易地监听请求。但是对于字符编码,你可以使用tcpdump 这样的工具监听TCP的交互,并且立刻就能看到客户端发送给服务器端的消息。

There is not the same level of support in Go for managing character streams. There are significant issues with character sets and character encodings, and we will explore these issues in a later chapter.

在Go中没有像字节流那样专门处理字符流的工具。如何处理字符集和字符编码是非常重要的,我们将会在下一章专门讨论这些问题。

If we just pretend everything is ASCII, like it was once upon a time, then character formats are quite straightforward to deal with. The principal complication at this level is the varying status of "newline" across different operating systems. Unix uses the single character '\n'. Windows and others (more correctly) use the pair "\r\n". On the internet, the pair "\r\n" is most common - Unix systems just need to take care that they don't assume '\n'.

如果和以前一样,处理的所有字符都是ASCII码,那么我们能直接又简单地处理这些字符。但是实际上,字符处理复杂的原因是不同的操作系统上有各种不统一的“换行符”。Unix使用简单的'\n' 来表示换行,Windows和其他的系统(这种方法更正确)使用“\r\n”来表示。在实际的网络传输中,使用一对“\r\n”是更通用的方案 -- 因为Unix系统只需要注意不要设定换行符只有“\n”就可以满足这个方案。

Simple Example

简单的例子

This example deals with a directory browsing protocol - basically a stripped down version of FTP, but without even the file transfer part. We only consider listing a directory name, listing the contents of a directory and changing the current directory - all on the server side, of course. This is a complete worked example of creating all components of a client-server application. It is a simple program which includes messages in both directions, as well as the design of a messaging protocol.

这个例子展示的是一个文件夹浏览协议 -- 基本上就是一个简单的FTP协议,只是连FTP的文件传输都没有实现。我们考虑这个例子包含的功能有:展示文件夹名称,列出文件夹内包含的文件,改变当前文件夹路径 -- 当然所有这些文件都是在服务器的。这是一个完整的包含客户端和服务器的例子。这个简单的程序既需要两个方向的消息交互,也需要消息的具体协议设计。

Before distributing the application, consider a simple standalone (non-client-server) program that lets you list the files in a directory, change the current directory and print its name. We omit copying files, as that adds to the length of the program without really introducing important concepts. For simplicity, all filenames will be assumed to be in 7-bit ASCII. The pseudo-code for such a standalone application would be

在开始例子之前,我们先看一个简单的程序,这个程序不是客户端和服务器交互的程序,它实现的功能包括:展示文件夹中的文件,打印出文件夹在服务器上的路径。在这里我们忽略正在拷贝中的文件,因为考虑这些细节会增加代码长度,却对我们要介绍的重要概念没有什么帮助。简单假设:所有的文件名都是7位的ASCII码。先考虑这个独立的程序,它的伪代码应该是:


read line from user
while not eof do
  if line == dir
    list directory
  else if line == cd <dir>
    change directory
  else if line == pwd
    print directory
  else if line == quit
    quit
  else
    complain

  read line from user

A non-distributed application would just link the UI and file access code.

一个非分布式的应用是将UI和文件存储代码连接起来

In a client-server situation, the client would be at the user end, talking to a server somewhere else. Aspects of this program belong solely at the presentation end, such as getting the commands from the user. Some are messages from the client to the server, some are solely at the server end.

在包含有客户端和服务器的情况下,客户端就代表用户终端,用来和服务器交互。这个程序最独立的部分就是表现层,比如如何获取用户的命令等。这个程序的消息有的是从客户端到服务器,有的只是在服务器。

For a simple directory browser, assume that all directories and files are at the server end, and we are only transferring file information from the server to the client. The client side (including presentation aspects) will become

对于简单的文件夹浏览器来说,假设所有的文件夹和文件都是在服务器端,我们也只需要从服务器传递文件消息给客户端。客户端的伪代码(包括表现层)应该如下:


read line from user
while not eof do
  if line == dir
    list directory        (ask the server)
  else if line == cd <dir>
    change directory      (ask the server)
  else if line == pwd
    print directory       (ask the server)
  else if line == quit
    quit
  else
    complain

  read line from user

where the lines marked "(ask the server)" involve communication with the server.

上面斜体字的部分是代表需要与服务器进行交互的命令。

Alternative presentation aspects

改变表现层

A GUI program would allow directory contents to be displayed as lists, for files to be selected and actions such as change directory to be performed on them. The client would be controlled by actions associated with various events that take place in graphical objects. The pseudo-code might look like

GUI程序可以很方便展示文件夹内容,选择文件,做一些诸如改变文件夹路径的操作。客户端被图形化对象中的各种定义好的事件所驱动从而实现功能。伪代码如下:


change dir button:
  if there is a selected file
    change directory
    if successful
      update directory label
      list directory
      update directory list

The functions called from the different UIs should be the same: changing the presentation should not change the networking code.

不同的UI实现的功能都是一样的 -- 改变表现层并不需要改变网络传输的代码
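
One possible way to keep the presentation separate from the networking code is to put the three operations behind an interface. The sketch below is an assumption about how this could be organised, not the structure of the client given later: DirBrowser, fakeBrowser and runCommand are hypothetical names, and fakeBrowser stands in for the real networking code so that the example runs on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// DirBrowser is a hypothetical boundary between presentation and networking.
type DirBrowser interface {
	List() ([]string, error)
	Chdir(dir string) error
	Pwd() (string, error)
}

// fakeBrowser stands in for the networking code so the sketch runs offline.
type fakeBrowser struct{ cwd string }

func (f *fakeBrowser) List() ([]string, error) { return []string{"a.txt", "b.txt"}, nil }
func (f *fakeBrowser) Chdir(dir string) error  { f.cwd = dir; return nil }
func (f *fakeBrowser) Pwd() (string, error)    { return f.cwd, nil }

// runCommand is a command-line front end; a GUI button handler
// would call the same DirBrowser methods.
func runCommand(b DirBrowser, line string) string {
	fields := strings.SplitN(strings.TrimSpace(line), " ", 2)
	switch fields[0] {
	case "dir":
		names, err := b.List()
		if err != nil {
			return "error: " + err.Error()
		}
		return strings.Join(names, "\n")
	case "cd":
		if len(fields) != 2 {
			return "usage: cd <dir>"
		}
		if err := b.Chdir(fields[1]); err != nil {
			return "Failed to change dir"
		}
		return "ok"
	case "pwd":
		dir, err := b.Pwd()
		if err != nil {
			return "error: " + err.Error()
		}
		return dir
	}
	return "Unknown command"
}

func main() {
	b := &fakeBrowser{cwd: "/"}
	fmt.Println(runCommand(b, "cd /tmp"))
	fmt.Println(runCommand(b, "pwd"))
}
```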

Protocol - informal

协议 -- 概述

client request    server response
dir               send list of files
cd <dir>          change dir
                  send error if failed
                  send ok if succeed
pwd               send current directory
quit              quit

Text protocol

文本传输协议

This is a simple protocol. The most complicated data structure that we need to send is an array of strings for a directory listing. We therefore don't need the heavy duty serialisation techniques of the previous chapter, and can use a simple text format instead.

这是一个简单的协议,最复杂的部分就是我们需要使用字符串数组来列出文件夹中内容。所以,我们就不使用最后一章讲到的繁琐复杂的序列化技术了,仅仅使用一种简单的文本格式就好了。

But even if we make the protocol simple, we still have to specify it in detail. We choose the following message format:

但是实际上,即使我们想尽量使得协议简单,在细节上也需要考虑清楚。我们使用下面的消息格式约定:

Some of the choices made above are weaker in real-life protocols. For example

实际上,上面的一些考虑在真实的协议中是远远不够的。比如

All of these variations exist in real protocols. Cumulatively, they just make the string processing more complex than in our case.

所有以上的变化和考虑都会在真实使用的协议中出现。渐渐地,这些会导致实际的字符处理程序比我们的这个例子复杂。

client request      server response
send "DIR"          send list of files, one per line
                    terminated by a blank line
send "CD <dir>"     change dir
                    send "ERROR" if failed
                    send "OK"
send "PWD"          send current working directory
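
The directory listing is the only reply with any structure, so it is worth sketching how it could be built and taken apart. The helpers formatListing and parseListing below are hypothetical and assume that every line, including the terminating blank one, ends in "\r\n", as the server code in this chapter does.

```go
package main

import (
	"fmt"
	"strings"
)

// formatListing builds the reply to "DIR": one name per line,
// terminated by a blank line.
func formatListing(names []string) string {
	var sb strings.Builder
	for _, name := range names {
		sb.WriteString(name + "\r\n")
	}
	sb.WriteString("\r\n")
	return sb.String()
}

// parseListing returns the names in a reply and whether the reply
// is complete, i.e. whether the terminating blank line has arrived.
func parseListing(reply string) ([]string, bool) {
	if reply == "\r\n" {
		return nil, true // empty directory: just the blank line
	}
	if !strings.HasSuffix(reply, "\r\n\r\n") {
		return nil, false // more data is still to come
	}
	body := strings.TrimSuffix(reply, "\r\n\r\n")
	return strings.Split(body, "\r\n"), true
}

func main() {
	reply := formatListing([]string{"a.txt", "b.txt"})
	names, complete := parseListing(reply)
	fmt.Println(names, complete)
}
```

The empty directory case matters: its reply is only two bytes long, so a receiver that looks for a four-byte "\r\n\r\n" suffix must check for it separately.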

Server code

服务器代码


/* FTP Server
 */
package main

import (
        "fmt"
        "net"
        "os"
)

const (
        DIR = "DIR"
        CD  = "CD"
        PWD = "PWD"
)

func main() {

        service := "0.0.0.0:1202"
        tcpAddr, err := net.ResolveTCPAddr("tcp", service)
        checkError(err)

        listener, err := net.ListenTCP("tcp", tcpAddr)
        checkError(err)

        for {
                conn, err := listener.Accept()
                if err != nil {
                        continue
                }
                go handleClient(conn)
        }
}

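func handleClient(conn net.Conn) {
        defer conn.Close()

        var buf [512]byte
        for {
                n, err := conn.Read(buf[0:])
                if err != nil {
                        // the deferred Close releases the connection
                        return
                }

                s := string(buf[0:n])
                // decode request, checking lengths first so that a
                // short message cannot cause a slice bounds panic
                if len(s) >= 2 && s[0:2] == CD {
                        if len(s) > 3 {
                                chdir(conn, s[3:])
                        } else {
                                // "CD" with no directory given
                                conn.Write([]byte("ERROR"))
                        }
                } else if len(s) >= 3 && s[0:3] == DIR {
                        dirList(conn)
                } else if len(s) >= 3 && s[0:3] == PWD {
                        pwd(conn)
                }

        }
}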

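// chdir changes the working directory and reports the outcome.
// Note that os.Chdir affects the whole server process, so all
// connected clients share the same current directory.
func chdir(conn net.Conn, s string) {
        if os.Chdir(s) == nil {
                conn.Write([]byte("OK"))
        } else {
                conn.Write([]byte("ERROR"))
        }
}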

func pwd(conn net.Conn) {
        s, err := os.Getwd()
        if err != nil {
                conn.Write([]byte(""))
                return
        }
        conn.Write([]byte(s))
}

func dirList(conn net.Conn) {
        defer conn.Write([]byte("\r\n"))

        dir, err := os.Open(".")
        if err != nil {
                return
        }

        names, err := dir.Readdirnames(-1)
        if err != nil {
                return
        }
        for _, nm := range names {
                conn.Write([]byte(nm + "\r\n"))
        }
}

func checkError(err error) {
        if err != nil {
                fmt.Println("Fatal error ", err.Error())
                os.Exit(1)
        }
}

Client code

客户端代码


/* FTPClient
 */
package main

import (
        "fmt"
        "net"
        "os"
        "bufio"
        "strings"
        "bytes"
)

// strings used by the user interface
const (
        uiDir  = "dir"
        uiCd   = "cd"
        uiPwd  = "pwd"
        uiQuit = "quit"
)

// strings used across the network
const (
        DIR = "DIR"
        CD  = "CD"
        PWD = "PWD"
)

func main() {
        if len(os.Args) != 2 {
                fmt.Println("Usage: ", os.Args[0], "host")
                os.Exit(1)
        }

        host := os.Args[1]

        conn, err := net.Dial("tcp", host+":1202")
        checkError(err)

        reader := bufio.NewReader(os.Stdin)
        for {
                line, err := reader.ReadString('\n')
                // lose trailing whitespace
                line = strings.TrimRight(line, " \t\r\n")
                if err != nil {
                        break
                }

                // split into command + arg
                strs := strings.SplitN(line, " ", 2)
                // decode user request
                switch strs[0] {
                case uiDir:
                        dirRequest(conn)
                case uiCd:
                        if len(strs) != 2 {
                                fmt.Println("cd <dir>")
                                continue
                        }
                        fmt.Println("CD \"", strs[1], "\"")
                        cdRequest(conn, strs[1])
                case uiPwd:
                        pwdRequest(conn)
                case uiQuit:
                        conn.Close()
                        os.Exit(0)
                default:
                        fmt.Println("Unknown command")
                }
        }
}

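func dirRequest(conn net.Conn) {
        conn.Write([]byte(DIR + " "))

        var buf [512]byte
        result := bytes.NewBuffer(nil)
        for {
                // read till we hit a blank line
                n, err := conn.Read(buf[0:])
                if err != nil {
                        // without this check a closed connection
                        // would make the loop spin forever
                        fmt.Println("Failed to read directory listing")
                        return
                }
                result.Write(buf[0:n])
                contents := result.Bytes()
                // an empty directory is sent as just the blank line
                if string(contents) == "\r\n" {
                        return
                }
                if bytes.HasSuffix(contents, []byte("\r\n\r\n")) {
                        fmt.Println(string(contents[0 : len(contents)-4]))
                        return
                }
        }
}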

func cdRequest(conn net.Conn, dir string) {
        conn.Write([]byte(CD + " " + dir))
        var response [512]byte
        n, _ := conn.Read(response[0:])
        s := string(response[0:n])
        if s != "OK" {
                fmt.Println("Failed to change dir")
        }
}

func pwdRequest(conn net.Conn) {
        conn.Write([]byte(PWD))
        var response [512]byte
        n, _ := conn.Read(response[0:])
        s := string(response[0:n])
        fmt.Println("Current dir \"" + s + "\"")
}

func checkError(err error) {
        if err != nil {
                fmt.Println("Fatal error ", err.Error())
                os.Exit(1)
        }
}

State

状态

Applications often make use of state information to simplify what is going on. For example

应用程序经常保存状态消息来简化下面要做的事情,比如

In a distributed system, such state information may be kept in the client, in the server, or in both.

在分布式的系统中,这样的状态消息可能是保存在客户端,服务器,也可能两边都保存。

The important point is whether one process is keeping state information about itself or about the other process. One process may keep as much state information about itself as it wants, without causing any problems. If it needs to keep information about the state of the other process, then problems arise: the process's knowledge of the state of the other may become incorrect. This can be caused by loss of messages (in UDP), by failure to update, or by software errors.

最重要的一点是,进程是否需要保存 自身进程 或者其他进程 的状态消息。一个进程保存再多自己的状态信息,也不会引发其他问题。如果需要保存其他进程的状态消息,这个问题就复杂了:当前保存的其他进程的状态消息和实际的状态消息可能是不一致的。这可能会引起消息丢失(在UDP中)、更新失败、或者s/w错误等。

An example is reading a file. In single process applications the file handling code runs as part of the application. It maintains a table of open files and the location in each of them. Each time a read or write is done this file location is updated. In the DCE file system, the file server keeps track of a client's open files, and where the client's file pointer is. If a message could get lost (but DCE uses TCP) these could get out of sync. If the client crashes, the server must eventually time out the client's file tables and remove them.

一个例子就是读取文件。在单个进程中,文件处理代码是应用程序的一部分。它维持一个表,表中包含所有打开的文件和文件指针位置。每次文件读写的时候,文件指针位置就会更新。在数据通信(DCE)文件系统中,文件系统必须追踪客户端打开了哪些文件,客户端的文件指针在哪。如果一个消息丢失了(但是DCE是使用TCP的),这些状态消息就不能保持同步了。如果出现客户端崩溃了,服务器就必须对这个表触发超时并删除。

In NFS, the server does not maintain this state. The client does. Each file access from the client that reaches the server must open the file at the appropriate point, as given by the client, to perform the action.

在NFS文件系统中,服务器并没有保存这个状态消息,而是有客户端保存的。客户端每次在服务器进行的读取文件操作必须能在准确的文件位置打开文件,而这个文件位置是由客户端提供的,从而才能进行后续的操作。

If the server maintains information about the client, then it must be able to recover if the client crashes. If information is not saved, then on each transaction the client must transfer sufficient information for the server to function.

如果由服务器保持客户端的状态消息,服务器必须在客户端崩溃的时候进行修复。如果服务器没有储存状态消息,那么客户端的每次事务交互都需要提供足够的消息来让服务器进行操作。

If the connection is unreliable, then additional handling must be in place to ensure that the two do not get out of synch. The classic example is of bank account transactions where the messages get lost. A transaction server may need to be part of the client-server system.

如果连接是不可靠的,那么必须要有额外的处理程序来确保双方没有失去同步。一个消息丢失的典型例子就是银行账号交易系统。交易系统是客户端与服务器交互的一部分。

Application State Transition Diagram

应用状态转换图

A state transition diagram keeps track of the current state of an application and the changes that move it to new states.

一个状态转换图清晰说明了当前应用的状态和进入到新的状态需要的转换。

Example: file transfer with login:

例如:带登陆功能的文件传输:

This can also be expressed as a table

这个也可以使用一个表来表示

Current state    Transition         Next state
login            login failed       login
                 login succeeded    file transfer
file transfer    dir                file transfer
                 get                file transfer
                 logout             login
                 quit               -

Client state transition diagrams

客户端状态转换图

The client state diagram must follow the application diagram. It has more detail though: it writes and then reads.

客户端状态转换图就和应用转换图一样。不同的就是要注意更多细节:它包含有操作

Current state    Write                  Read                  Next state
login            LOGIN name password    FAILED                login
                                        SUCCEEDED             file transfer
file transfer    CD dir                 SUCCEEDED             file transfer
                                        FAILED                file transfer
                 GET filename           #lines + contents     file transfer
                                        ERROR                 file transfer
                 DIR                    #files + filenames    file transfer
                                        ERROR                 file transfer
                 quit                   none                  quit
                 logout                 none                  login

Server state transition diagrams

服务器状态转换图

The server state diagram must also follow the application diagram. It also has more detail: it reads and then writes.

服务器状态转换图也和应用转换图一样。不同的就是也要注意更多细节:它包含有 操作

Current state    Read                   Write                 Next state
login            LOGIN name password    FAILED                login
                                        SUCCEEDED             file transfer
file transfer    CD dir                 SUCCEEDED             file transfer
                                        FAILED                file transfer
                 GET filename           #lines + contents     file transfer
                                        ERROR                 file transfer
                 DIR                    #files + filenames    file transfer
                                        ERROR                 file transfer
                 quit                   none                  quit
                 logout                 none                  login

Server pseudocode

服务器伪代码


state = login
while true
    read line
    switch (state)
        case login:
            get NAME from line
            get PASSWORD from line
            if NAME and PASSWORD verified
                write SUCCEEDED
                state = file_transfer
            else
                write FAILED
                state = login
        case file_transfer:
            if line.startsWith CD
                get DIR from line
                if chdir DIR okay
                    write SUCCEEDED
                    state = file_transfer
                else
                    write FAILED
                    state = file_transfer
            ...

We don't give the actual code for this server or client since it is pretty straightforward.

由于这个伪代码已经足够清晰了,所以我们并不用给出具体的代码了。
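
For readers who would still like something concrete, the core of the pseudocode can be written as a single function that takes the current state and a request line and returns the reply and the next state. This is a partial sketch covering only the LOGIN and CD transitions: the verify function with its fixed guest account is a placeholder assumption, as is the choice to answer any other request with ERROR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

type state int

const (
	login state = iota
	fileTransfer
)

// verify is a stand-in for a real credential check (assumption).
func verify(name, password string) bool {
	return name == "guest" && password == "guest"
}

// step handles one request line in state s and returns the reply
// to write and the next state.
func step(s state, line string) (string, state) {
	fields := strings.Fields(line)
	switch s {
	case login:
		if len(fields) == 3 && fields[0] == "LOGIN" && verify(fields[1], fields[2]) {
			return "SUCCEEDED", fileTransfer
		}
		return "FAILED", login
	case fileTransfer:
		if len(fields) == 2 && fields[0] == "CD" {
			if os.Chdir(fields[1]) == nil {
				return "SUCCEEDED", fileTransfer
			}
			return "FAILED", fileTransfer
		}
	}
	// requests not covered by this sketch
	return "ERROR", s
}

func main() {
	s := login
	for _, line := range []string{"LOGIN guest wrong", "LOGIN guest guest", "CD ."} {
		var reply string
		reply, s = step(s, line)
		fmt.Printf("%-20s -> %s\n", line, reply)
	}
}
```

Keeping the transitions in a function with no network code makes it possible to check them against the state tables before any connection handling is written.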

Summary

总结

Building any application requires design decisions before you start writing code. For distributed applications you have a wider range of decisions to make compared to standalone systems. This chapter has considered some of those aspects and demonstrated what the resultant code might look like.

任何应用程序在开始编写前都需要详尽的设计。开发一个分布式的系统比开发一个独立系统需要更宽广的视野和思维来做决定和思考。这一章已经考虑到了一些这样的问题,并且展示了最终代码的大致样子。

Copyright Jan Newmarch, jan@newmarch.name
