OpenResty® Scalable Web Platform by Extending NGINX with Lua

OpenResty® C Coding Style Guide

Yichun Zhang (agentzh), 14 May 2019 (created 11 Feb 2019)

OpenResty follows NGINX's coding style in its C language components, like OpenResty's own NGINX add-on modules and the C parts of OpenResty's own Lua libraries. Unfortunately, even the NGINX core's own C source code does not always strictly follow the same conventions as the rest of the code base. It is therefore desirable to prepare a formal guideline document to avoid any ambiguity.

Patches contributed to the OpenResty core projects should always follow this guideline; otherwise they will not pass the review process and will not get merged as is. The OpenResty and NGINX communities are always encouraged to follow this guideline when developing their own add-on modules and libraries in C.

Naming convention

For NGINX related C code, source file names (including .c and .h files), global variables, global functions, C struct/union/enum names, compilation-unit scoped static variables and functions, as well as public macros defined in header files should always use fully-qualified names like ngx_http_core_module.c, ngx_http_finalize_request and NGX_HTTP_MAIN_CONF. This is important because the C language does not have the concept of explicit namespaces like in C++. Using fully-qualified names helps avoid symbol clashes and also helps debugging.

In Lua libraries' C components, we should also use prefixes like resty_blah_ (if the library is called lua-resty-blah) for all the top-level C symbols in the corresponding C compilation units.
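
As a rough illustration, assuming an imaginary library called lua-resty-blah, the top-level symbols of one of its C compilation units might look like the sketch below. Every name here (resty_blah_version, resty_blah_is_digit, resty_blah_get_version, resty_blah_count_digits) is made up for this example; only the prefixing pattern is the point.

```c
#include <stddef.h>


/* a compilation-unit scoped variable still carries the library prefix */
static const char  resty_blah_version[] = "0.01";


static int
resty_blah_is_digit(unsigned char c)
{
    return c >= '0' && c <= '9';
}


const char *
resty_blah_get_version(void)
{
    return resty_blah_version;
}


/* counts the decimal digits in the first len bytes of s */
size_t
resty_blah_count_digits(const unsigned char *s, size_t len)
{
    size_t                n;
    const unsigned char  *p, *last;

    n = 0;
    last = s + len;

    for (p = s; p < last; p++) {
        if (resty_blah_is_digit(*p)) {
            n++;
        }
    }

    return n;
}
```

Note that even the static helper and the static variable use the resty_blah_ prefix, while the local variables inside the function stay short.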

We should use short names for local variables defined in C functions. Common short names used extensively in the NGINX core are cl, ev, ctx, v, p, q, etc. Such variables are usually short-lived and have very limited scope. According to the Huffman principle, we should use short names for commonly used things in the current context to avoid line noise. Even short names should follow NGINX's conventions. Do not invent your own unless necessary. And do use meaningful names. Even p and q are meaningful: they are the common names for string pointer variables in the context of string processing.

C struct and union member names should use the full spelling form of words wherever possible (unless the member name would be too long). For example, in NGINX core's struct ngx_http_request_s, we have long member names like read_event_handler, upstream_states, and request_body_in_persistent_file.

We should use _t suffix for typedef type names referring to structs, _s for struct names, and _e for typedef type names referring to enums. Local types defined in function scopes are not subject to this suffix convention. Below are some examples from the NGINX core:

typedef struct ngx_connection_s      ngx_connection_t;


typedef struct {
    WSAOVERLAPPED    ovlp;
    ngx_event_t     *event;
    int              error;
} ngx_event_ovlp_t;


struct ngx_chain_s {
    ngx_buf_t    *buf;
    ngx_chain_t  *next;
};


typedef enum {
    ngx_pop3_start = 0,
    ngx_pop3_user,
    ...
    ngx_pop3_auth_external
} ngx_pop3_state_e;
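
To see how the three suffixes work together, here is a small self-contained sketch for an imaginary lua-resty-blah library. The type and function names are hypothetical. It also shows one practical reason for having both the _s and the _t name: the typedef can be declared first, so the struct can refer to itself through the _t name.

```c
#include <stddef.h>


/* the _t typedef is declared first so that the struct can refer to itself */
typedef struct resty_blah_node_s  resty_blah_node_t;


typedef enum {
    resty_blah_state_idle = 0,
    resty_blah_state_busy,
    resty_blah_state_done
} resty_blah_state_e;


struct resty_blah_node_s {
    int                 value;
    resty_blah_state_e  state;
    resty_blah_node_t  *next;
};


/* returns the number of nodes in the list that are in the given state */
size_t
resty_blah_count_state(resty_blah_node_t *head, resty_blah_state_e state)
{
    size_t              n;
    resty_blah_node_t  *node;

    n = 0;

    for (node = head; node != NULL; node = node->next) {
        if (node->state == state) {
            n++;
        }
    }

    return n;
}
```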

Indentation

The NGINX world uses spaces exclusively for indentation. Do not use tabs! Usually we use 4-space indentation unless there are special alignment requirements or other requirements in certain cases (we will explain such cases in detail below).

Always indent your code properly.

The 80 column limit

All source code lines should be kept within the 80 column limit (some code in the NGINX core even keeps to 78 columns, but I suggest 80 columns as the hard limit). Different contexts have different indentation rules for continuation lines. We will discuss each case in detail below.

Line trailing white-spaces

There should never be any spaces or tabs at the end of source lines, not even on blank lines. Many editors support highlighting or trimming such white-space characters automatically on the user's behalf. Make sure you configure your editor or IDE properly.
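
The three rules above (no tabs, the 80 column limit, and no trailing white-space) are mechanical enough to check with a few lines of C. Below is a minimal sketch of such a check for a single line. The function name style_check_line and the STYLE_* flags are hypothetical, the length is counted in bytes rather than display columns, and this is only an illustration, not a replacement for a real style checker.

```c
#include <stddef.h>
#include <string.h>


#define STYLE_HAS_TAB      0x01
#define STYLE_TOO_LONG     0x02
#define STYLE_TRAILING_WS  0x04

#define STYLE_MAX_COLUMNS  80


/*
 * checks one source line (without its trailing newline) and returns
 * a bit mask of the STYLE_* flags above, or 0 if the line is clean
 */
unsigned
style_check_line(const char *line)
{
    size_t    len;
    unsigned  flags;

    flags = 0;
    len = strlen(line);

    if (memchr(line, '\t', len) != NULL) {
        flags |= STYLE_HAS_TAB;
    }

    if (len > STYLE_MAX_COLUMNS) {
        flags |= STYLE_TOO_LONG;
    }

    if (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t')) {
        flags |= STYLE_TRAILING_WS;
    }

    return flags;
}
```

A caller would feed each line of a source file to this function and report the lines for which the result is nonzero.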

Function declarations

C function declarations (not definitions!) used in header files or at the beginning of .c files should put everything in a single line if possible. Below is an example from the NGINX core:

ngx_int_t ngx_http_send_special(ngx_http_request_t *r, ngx_uint_t flags);

If the line is too long, exceeding 80 columns, then we should split the declaration into more lines, with a 4-space indentation. For example,

ngx_int_t ngx_http_filter_finalize_request(ngx_http_request_t *r,
    ngx_module_t *m, ngx_int_t error);

If the return type is a pointer type, then there should be a space before but not after the first *, as in

char *ngx_http_types_slot(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);

Please note that function definitions follow a different style than declarations. See Function definitions for more details.

Function definitions

C function definitions follow a different style than their declarations (see Function declarations). The first line should be the return type alone, the 2nd line holds the function name as well as the parameter list, and the 3rd line holds the opening curly bracket alone. Below is an example from the NGINX core:

ngx_int_t
ngx_http_compile_complex_value(ngx_http_compile_complex_value_t *ccv)
{
    ...
}

Please note that there are no spaces around the ( character of the parameter list, and that the first 3 lines are not indented.

If the parameter list is too long, exceeding the 80 column limit, then we can break up the parameter list into separate lines with a 4-space indentation for each continuation line. Below is such an example from the NGINX core:

ngx_int_t
ngx_http_complex_value(ngx_http_request_t *r, ngx_http_complex_value_t *val,
    ngx_str_t *value)
{
    ...
}

If the return type is a pointer type, then there should be a space before the first *, like this:

static char *
ngx_http_core_pool_size(ngx_conf_t *cf, void *post, void *data)
{
    ...
}
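
To compare the two styles side by side, the sketch below puts the declarations and the definitions of the same two functions in one self-contained file. The functions belong to an imaginary lua-resty-blah library and their names are hypothetical. The second declaration would take about 90 columns on one line, so it is wrapped with a 4-space indentation in both forms.

```c
#include <stddef.h>
#include <string.h>


/* declarations: return type, name, and parameters share a line */
size_t resty_blah_skip_spaces(const char *s, size_t len);
char *resty_blah_copy_bounded(char *dst, size_t dst_size, const char *src,
    size_t src_len);


/* returns the number of leading spaces in the first len bytes of s */
size_t
resty_blah_skip_spaces(const char *s, size_t len)
{
    size_t  i;

    for (i = 0; i < len; i++) {
        if (s[i] != ' ') {
            break;
        }
    }

    return i;
}


/*
 * copies at most dst_size - 1 bytes and always NUL-terminates dst;
 * returns a pointer to the terminating NUL, or NULL if dst_size is 0
 */
char *
resty_blah_copy_bounded(char *dst, size_t dst_size, const char *src,
    size_t src_len)
{
    size_t  n;

    if (dst_size == 0) {
        return NULL;
    }

    n = (src_len < dst_size - 1) ? src_len : dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';

    return dst + n;
}
```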

Local variables

In section Naming convention, we require local variables to use shorter names like ev, clcf, etc. Their definitions also have some style requirements.

They should always be put at the beginning of each C function definition block, not just at the beginning of any arbitrary code block, unless an exception is needed to aid debugging or to meet some other special requirement. Also, their variable identifiers (excluding any * prefixes) must be aligned vertically. Below is an example from the NGINX core:

    ngx_str_t            *value;
    ngx_uint_t            i;
    ngx_regex_elt_t      *re;
    ngx_regex_compile_t   rc;
    u_char                errstr[NGX_MAX_CONF_ERRSTR];

Please note how the identifiers value, i, re, rc, and errstr are aligned up vertically. The * prefix does not count in this alignment.

Sometimes a local variable's definition is exceptionally long, and aligning it with the rest of the variables would make the code ugly. In that case we should put a single blank line between this long variable definition and the rest of the local variable definitions, and the two groups' identifiers do not need to be aligned vertically. Below is such an example:

static char *
ngx_http_core_open_file_cache(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    ngx_http_core_loc_conf_t *clcf = conf;

    time_t       inactive;
    ngx_str_t   *value, s;
    ngx_int_t    max;
    ngx_uint_t   i;
    ...
}

Note how the definition of the variable clcf is separated from the rest of the local variables by a blank line. The rest of the local variables are still aligned vertically.

The local variable declarations must also be followed by a blank line which separates them from the actual executable statements of the current C function. For example:

u_char * ngx_cdecl
ngx_sprintf(u_char *buf, const char *fmt, ...)
{
    u_char   *p;
    va_list   args;

    va_start(args, fmt);
    p = ngx_vslprintf(buf, (void *) -1, fmt, args);
    va_end(args);

    return p;
}

There is a blank line right after the local variable definitions.
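
The rules of this section can be combined in one small function outside the NGINX core. The sketch below assumes a hypothetical configuration struct resty_blah_cache_conf_t and a hypothetical function resty_blah_cache_free_slots. The long, initialized definition forms its own group, the remaining locals follow after one blank line, and another blank line separates the definitions from the statements.

```c
#include <stddef.h>


typedef struct {
    size_t  max_items;
    size_t  used_items;
} resty_blah_cache_conf_t;


/* returns how many more items fit into the cache described by data */
size_t
resty_blah_cache_free_slots(void *data)
{
    resty_blah_cache_conf_t *conf = data;

    size_t  max, used;

    max = conf->max_items;
    used = conf->used_items;

    if (used >= max) {
        return 0;
    }

    return max - used;
}
```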

Use of blank lines

Successive C function definitions, multi-line global/static variable definitions, and struct/union/enum definitions must be separated by 2 blank lines. Below is an example for successive C function definitions:

void
foo(void)
{
    /* ... */
}


int
bar(...)
{
    /* ... */
}

And here is an example for successive static variable definitions:

static ngx_conf_bitmask_t  ngx_http_core_keepalive_disable[] = {
    ...
    { ngx_null_string, 0 }
};


static ngx_path_init_t  ngx_http_client_temp_path = {
    ngx_string(NGX_HTTP_CLIENT_TEMP_PATH), { 0, 0, 0 }
};

Single-line variable definitions may be grouped together, as in

static ngx_str_t  ngx_http_gzip_no_cache = ngx_string("no-cache");
static ngx_str_t  ngx_http_gzip_no_store = ngx_string("no-store");
static ngx_str_t  ngx_http_gzip_private = ngx_string("private");

Below is an example for successive (multi-line) struct definitions:

struct ngx_http_log_ctx_s {
    ngx_connection_t    *connection;
    ngx_http_request_t  *request;
    ngx_http_request_t  *current_request;
};


struct ngx_http_chunked_s {
    ngx_uint_t           state;
    off_t                size;
    off_t                length;
};


typedef struct {
    ngx_uint_t           http_version;
    ngx_uint_t           code;
    ngx_uint_t           count;
    u_char              *start;
    u_char              *end;
} ngx_http_status_t;

All separated by 2 blank lines.

Different kinds of these top-level object definitions should also be separated by 2 blank lines if they are neighbors, for example:

#if (NGX_HTTP_DEGRADATION)
ngx_uint_t  ngx_http_degraded(ngx_http_request_t *);
#endif


extern ngx_module_t  ngx_http_module;

The function declaration is separated by 2 blank lines from the following C global variable declaration.

Successive C function declarations do not use 2 blank lines to separate each other, as in

ngx_int_t ngx_http_discard_request_body(ngx_http_request_t *r);
void ngx_http_discarded_request_body_handler(ngx_http_request_t *r);
void ngx_http_block_reading(ngx_http_request_t *r);
void ngx_http_test_reading(ngx_http_request_t *r);

Even when some of them span multiple lines, as in

char *ngx_http_merge_types(ngx_conf_t *cf, ngx_array_t **keys,
    ngx_hash_t *types_hash, ngx_array_t **prev_keys,
    ngx_hash_t *prev_types_hash, ngx_str_t *default_types);
ngx_int_t ngx_http_set_default_types(ngx_conf_t *cf, ngx_array_t **types,
    ngx_str_t *default_type);

Still, sometimes we could use 2 blank lines to separate them into semantically meaningful groups, for better code readability, as in

ngx_int_t ngx_http_send_header(ngx_http_request_t *r);
ngx_int_t ngx_http_special_response_handler(ngx_http_request_t *r,
    ngx_int_t error);
ngx_int_t ngx_http_filter_finalize_request(ngx_http_request_t *r,
    ngx_module_t *m, ngx_int_t error);
void ngx_http_clean_header(ngx_http_request_t *r);


ngx_int_t ngx_http_discard_request_body(ngx_http_request_t *r);
void ngx_http_discarded_request_body_handler(ngx_http_request_t *r);
void ngx_http_block_reading(ngx_http_request_t *r);
void ngx_http_test_reading(ngx_http_request_t *r);

The first group is mostly about response headers while the latter group is for request bodies.

Type casting

The C language does not require explicit type casting when assigning the value of a void pointer (void *) to a non-void pointer. And the NGINX coding style does not require that either. For instance:

char *
ngx_http_types_slot(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
    char  *p = conf;
    ...
}

Here the conf variable is a void pointer and the NGINX core assigns it to the local variable p of type char * without any explicit type casting.

When explicit type casting is needed, make sure there is a space before the first * character for the target pointer type name, and also a space after the ) character, as in

*types = (void *) -1;

There is a space before *) and also a space after ). This also applies when the value being cast is an expression:

if ((size_t) (last - buf) < len) {
    ...
}

Or multiple successive type casting:

aio->aiocb.aio_data = (uint64_t) (uintptr_t) ev;

Note the space between (uint64_t) and (uintptr_t), as well as the space after (uintptr_t).
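
The same spacing rules can be tried out in a standalone file. The sketch below uses two hypothetical helper functions, resty_blah_is_short and resty_blah_roundtrip, and assumes a platform where uintptr_t is available and no wider than 64 bits.

```c
#include <stddef.h>
#include <stdint.h>


/* returns 1 if the range [buf, last) holds fewer than len bytes */
int
resty_blah_is_short(const unsigned char *buf, const unsigned char *last,
    size_t len)
{
    /* a space follows the closing parenthesis of the cast */
    return (size_t) (last - buf) < len;
}


/* stores a pointer in a 64-bit integer and converts it back */
void *
resty_blah_roundtrip(void *ptr)
{
    uint64_t  data;

    /* successive casts are separated by a single space */
    data = (uint64_t) (uintptr_t) ptr;

    return (void *) (uintptr_t) data;
}
```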

If statements

NGINX's use of C's if statements also has some style requirements.

First of all, there must be a space after the if keyword, and also a space between the condition's closing parenthesis and the opening curly bracket. That is,

if (a > 3) {
    ...
}

Note the space after if and the space before {. Note, however, that there are no spaces right after ( or right before ).

Also note that the opening curly bracket must be on the same line as the if keyword, unless this line would exceed 80 columns, in which case, we should split the condition into multiple lines and put the opening curly bracket on its own line. The following example demonstrates this:

            if (ngx_http_set_default_types(cf, prev_keys, default_types)
                != NGX_OK)
            {
                return NGX_CONF_ERROR;
            }

Note how != NGX_OK is aligned vertically with the condition part of the if statement (excluding the opening parenthesis).

When logical operators are involved in the long condition part, then we should make sure the connecting logical operators are at the beginning of the subsequent lines and the indentation reflects the nesting structure of the condition expression, as in

        if (file->use_event
            || (file->event == NULL
                && (of->uniq == 0 || of->uniq == file->uniq)
                && now - file->created < of->valid
#if (NGX_HAVE_OPENAT)
                && of->disable_symlinks == file->disable_symlinks
                && of->disable_symlinks_from == file->disable_symlinks_from
#endif
            ))
        {
            ...
        }

We can ignore the macro directives in the middle. They are not really relevant to the coding style of the if statement itself.

Usually we should leave a blank line after the if statement's code block if there are other statements following it. For example:

        if (rc != NGX_OK && (of->err == 0 || !of->errors)) {
            goto failed;
        }

        if (of->is_dir) {
            ...
        }

Note how a blank line is used to separate successive if statement blocks. Or with some other statements:

        if (file->is_dir) {

            /*
             * chances that directory became file are very small
             * so test_dir flag allows to use a single syscall
             * in ngx_file_info() instead of three syscalls
             */

            of->test_dir = 1;
        }

        of->fd = file->fd;
        of->uniq = file->uniq;

Similarly, there is often a single blank line used before the if statement, as in

        rc = ngx_open_and_stat_file(name, of, pool->log);

        if (rc != NGX_OK && (of->err == 0 || !of->errors)) {
            goto failed;
        }

Using blank lines around such code blocks helps make the code less crowded. The same applies to while statements, for statements, etc.

If statements must always use curly brackets even when the "then" branch has only a single statement. For instance,

            if (file->is_dir || file->err) {
                goto update;
            }

We must not omit the curly braces in such cases even though the standard C language allows that.

else part

When the if statement takes an else branch, then it also must take curly braces to group the contained statements. Also, a blank line must be used before the } else { line. Below is an example:

    if (of->disable_symlinks == NGX_DISABLE_SYMLINKS_NOTOWNER
        && !(create & (NGX_FILE_CREATE_OR_OPEN|NGX_FILE_TRUNCATE)))
    {
        fd = ngx_openat_file_owner(at_fd, p, mode, create, access, log);

    } else {
        fd = ngx_openat_file(at_fd, p, mode|NGX_FILE_NOFOLLOW, create, access);
    }

Note how } else { is put on the same line and there is a blank line right before the } else { line.
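
The if and else rules can be seen together in the following self-contained sketch. The functions resty_blah_check_arguments and resty_blah_clamp and the RESTY_BLAH_* codes are hypothetical. The first condition would need about 88 columns on one line, so it is split and the opening curly bracket gets a line of its own. The } else if (...) { lines follow the same blank-line rule as } else {.

```c
#include <stddef.h>


#define RESTY_BLAH_OK      0
#define RESTY_BLAH_ERROR  -1


static int
resty_blah_check_arguments(long lower_bound, long upper_bound, long *result)
{
    if (result == NULL || lower_bound > upper_bound) {
        return RESTY_BLAH_ERROR;
    }

    return RESTY_BLAH_OK;
}


/* clamps value into [lower_bound, upper_bound] and stores it in *result */
int
resty_blah_clamp(long value, long lower_bound, long upper_bound, long *result)
{
    /* the condition does not fit in 80 columns, so { gets its own line */
    if (resty_blah_check_arguments(lower_bound, upper_bound, result)
        != RESTY_BLAH_OK)
    {
        return RESTY_BLAH_ERROR;
    }

    if (value < lower_bound) {
        *result = lower_bound;

    } else if (value > upper_bound) {
        *result = upper_bound;

    } else {
        *result = value;
    }

    return RESTY_BLAH_OK;
}
```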

For statements

The for statement is similar to the if statement style explained in section If statements in many ways. A space is also required after the for keyword and also before {. Additionally, curly braces must be used for the contained statements. Furthermore, a space is required right after ; in the for condition part. The following example demonstrates these requirements:

for (i = 0; i < size; i++) {
    ...
}

A special case is the infinite loop, which is usually encoded as below in the NGINX world:

    for ( ;; ) {
        ...
    }

Or when comma expressions are used in the for statement's condition part:

    for (i = 0, n = 2; n < cf->args->nelts; i++, n++) {
        ...
    }

Or when the loop condition alone is omitted:

    for (p = pool, n = pool->d.next; /* void */; p = n, n = n->d.next) {
        ...
    }

While statements

The while statement is similar to the if statement style explained in section If statements in many ways. A space is also required after the while keyword and also before {. Additionally, curly braces must be used for the contained statements. Below is an example:

    while (log->next) {
        if (new_log->log_level > log->next->log_level) {
            new_log->next = log->next;
            log->next = new_log;
            return;
        }

        log = log->next;
    }

Do-while statements are also similar:

        do {
            p = h2c->state.handler(h2c, p, end);

            if (p == NULL) {
                return;
            }

        } while (p != end);

Note the use of a single space between do and {, as well as a single space before and after while.

Switch statements

The switch statement is similar to the if statement style explained in section If statements in many ways. A space is also required after the switch keyword and also before {. Additionally, curly braces must be used for the contained statements. Below is an example:

    switch (unit) {
    case 'K':
    case 'k':
        len--;
        max = NGX_MAX_SIZE_T_VALUE / 1024;
        scale = 1024;
        break;

    case 'M':
    case 'm':
        len--;
        max = NGX_MAX_SIZE_T_VALUE / (1024 * 1024);
        scale = 1024 * 1024;
        break;

    default:
        max = NGX_MAX_SIZE_T_VALUE;
        scale = 1;
    }

Note how the case labels are aligned vertically with the switch keyword.

Sometimes, a blank line is used before the first case label line, as in

        switch (c->log_error) {

        case NGX_ERROR_IGNORE_EINVAL:
        case NGX_ERROR_IGNORE_ECONNRESET:
        case NGX_ERROR_INFO:
            level = NGX_LOG_INFO;
            break;

        default:
            level = NGX_LOG_ERR;
        }
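
For a version that can be compiled on its own, the sketch below wraps a similar switch statement in a hypothetical function resty_blah_unit_scale. The case labels line up with the switch keyword, each group of cases ends with a blank line, and the last branch does not need a break.

```c
#include <stddef.h>


/*
 * returns the multiplier for a size unit suffix such as 'k' or 'M',
 * 1 for no suffix ('\0'), or 0 if the suffix is not recognized
 */
size_t
resty_blah_unit_scale(char unit)
{
    size_t  scale;

    switch (unit) {
    case 'K':
    case 'k':
        scale = 1024;
        break;

    case 'M':
    case 'm':
        scale = 1024 * 1024;
        break;

    case '\0':
        scale = 1;
        break;

    default:
        scale = 0;
    }

    return scale;
}
```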

Allocation error handling

The NGINX world has a good habit of always checking for dynamic memory allocation failures. Such checks are everywhere, like this:

    sa = ngx_palloc(cf->pool, socklen);
    if (sa == NULL) {
        return NULL;
    }

These two statements appear together so frequently that we usually do not put a blank line between the allocation statement and the if statement.

Make sure you never omit such a check after a dynamic memory allocation statement.
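
The same habit carries over to code that does not use NGINX's pool allocator. The sketch below uses the standard C malloc function instead of ngx_palloc, in a hypothetical function resty_blah_strndup. The allocation and its check stay together without a blank line in between.

```c
#include <stdlib.h>
#include <string.h>


/* returns a heap-allocated copy of the first len bytes of src, or NULL */
char *
resty_blah_strndup(const char *src, size_t len)
{
    char  *dst;

    dst = malloc(len + 1);
    if (dst == NULL) {
        return NULL;
    }

    memcpy(dst, src, len);
    dst[len] = '\0';

    return dst;
}
```

The caller owns the returned buffer and is expected to release it with free.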

Function calls

C function calls should not put any spaces around the opening or closing parentheses for the argument list. Below is an example:

sa = ngx_palloc(cf->pool, socklen);

When a function call is so long that it would exceed the 80 column limit, we should break up the argument list into separate lines. The subsequent lines must align vertically with the first argument, as in

        buf->pos = ngx_slprintf(buf->start, buf->end, "MEMLOG %uz %V:%ui%N",
                                size, &cf->conf_file->file.name,
                                cf->conf_file->line);
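
Here is a self-contained sketch of the same wrapping rule with the standard snprintf function. The wrapper resty_blah_format_usage is hypothetical and assumes that used is not greater than total. Note the difference between the two wrapped lines: the parameter list of the definition continues with a 4-space indentation, while the argument list of the call aligns with its first argument.

```c
#include <stdio.h>


/* formats a usage summary into buf and returns the result of snprintf */
int
resty_blah_format_usage(char *buf, size_t size, const char *name,
    unsigned long used, unsigned long total)
{
    return snprintf(buf, size, "%s: %lu of %lu bytes used (%lu free)",
                    name, used, total, total - used);
}
```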

Macros

Macro definitions require a single space after #define and at least 2 spaces before the definition body part. For example:

#define F(x, y, z)  ((z) ^ ((x) & ((y) ^ (z))))

Sometimes more spaces may be used before the definition body part for the sake of vertical alignment among multiple closely related macro definitions, as in

#define NGX_RESOLVE_A         1
#define NGX_RESOLVE_CNAME     5
#define NGX_RESOLVE_PTR       12
#define NGX_RESOLVE_MX        15
#define NGX_RESOLVE_TXT       16
#define NGX_RESOLVE_AAAA      28
#define NGX_RESOLVE_SRV       33
#define NGX_RESOLVE_DNAME     39
#define NGX_RESOLVE_FORMERR   1
#define NGX_RESOLVE_SERVFAIL  2

For macro definitions spanning multiple lines, we should align up the line continuation character \ vertically, as in

#define ngx_conf_init_value(conf, default)                                   \
    if (conf == NGX_CONF_UNSET) {                                            \
        conf = default;                                                      \
    }

We recommend putting \ on the 78th column, though the NGINX core sometimes disagrees with itself.
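
The following sketch applies these macro rules to an imaginary lua-resty-blah library. All the macro names are hypothetical, and the multi-line macro simply mirrors the shape of the NGINX example above with the arguments parenthesized.

```c
#define RESTY_BLAH_UNSET  -1

#define resty_blah_max(a, b)  (((a) > (b)) ? (a) : (b))

#define resty_blah_init_value(conf, default_value)                           \
    if ((conf) == RESTY_BLAH_UNSET) {                                        \
        (conf) = (default_value);                                            \
    }
```

Like the macro it mirrors, resty_blah_init_value expands to a bare if statement, so it is best used as a statement of its own rather than inside an unbraced if/else.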

Global/Static variables

Definitions and declarations of global variables and top-level static variables should put at least 2 spaces between the type specifier and the variable identifier part (including any leading * modifiers). Below are some examples:

ngx_uint_t   ngx_http_max_module;


ngx_http_output_header_filter_pt  ngx_http_top_header_filter;
ngx_http_output_body_filter_pt    ngx_http_top_body_filter;
ngx_http_request_body_filter_pt   ngx_http_top_request_body_filter;

The same applies to variable definitions taking an initializer expression, as in

ngx_str_t  ngx_http_html_default_types[] = {
    ngx_string("text/html"),
    ngx_null_string
};

Operators

Binary operators

A single space is required before and after most of the binary C operators like arithmetic operators, bit operators, relational operators, and logical operators. Below are some examples:

yday = days - (365 * year + year / 4 - year / 100 + year / 400);

and also

if (*p >= '0' && *p <= '9') {

For struct/union member operators -> and ., no spaces are allowed around them, for instance:

ls = cycle->listening.elts;

For the comma operator, a single space should be used after the comma, not before:

for (p = pool, n = pool->d.next; /* void */; p = n, n = n->d.next) {

NGINX usually avoids the comma operators except in the context of for statement conditions and in multiple variable declarations of the same type. Better split your comma expressions into separate statements in other cases.

Unary operators

We usually do not put any spaces before or after the C unary prefix operators. Below are some examples:

for (p = salt; *p && *p != '$' && p < last; p++) { /* void */ }
#define SET(n)      (*(uint32_t *) &p[n * 4])

Note that we do not put any spaces around the unary * operator or the unary & operator (the space used before & in the 2nd example above is due to the use of type casting expression; see section Type casting for more details).

The same applies to the suffix operators:

for (value = 0; n--; line++) {

Ternary operators

Ternary operators also require the use of spaces around the operators, just as with the binary operators. For example:

node = (rc < 0) ? node->left : node->right;

As we can see from this example, when the condition part of the ternary operator is an expression, we can also add a pair of parentheses around it. This is not required though.

Struct/union/enum definitions

The definition styles for structs, unions, and enums are similar. They should align the fields' identifiers vertically, in a similar way to the local variable definitions explained in section Local variables. We will just give some real examples from the NGINX core to demonstrate the style:

typedef struct {
    ngx_uint_t           http_version;
    ngx_uint_t           code;
    ngx_uint_t           count;
    u_char              *start;
    u_char              *end;
} ngx_http_status_t;

Just as in the case of local variable definitions, we can also use blank lines to separate groups of fields, as in

struct ngx_http_request_s {
    uint32_t                          signature;         /* "HTTP" */

    ngx_connection_t                 *connection;

    void                            **ctx;
    void                            **main_conf;
    void                            **srv_conf;
    void                            **loc_conf;

    ngx_http_event_handler_pt         read_event_handler;
    ngx_http_event_handler_pt         write_event_handler;
    ...
};

In this case, each group still must align up the field member identifiers vertically, but different groups are not required to be aligned (although we still could, as demonstrated in the example above).

Unions are similar:

typedef union epoll_data {
    void         *ptr;
    int           fd;
    uint32_t      u32;
    uint64_t      u64;
} epoll_data_t;

So are enums:

typedef enum {
    NGX_HTTP_INITING_REQUEST_STATE = 0,
    NGX_HTTP_READING_REQUEST_STATE,
    NGX_HTTP_PROCESS_REQUEST_STATE,

    NGX_HTTP_CONNECT_UPSTREAM_STATE,
    NGX_HTTP_WRITING_UPSTREAM_STATE,
    NGX_HTTP_READING_UPSTREAM_STATE,

    NGX_HTTP_WRITING_REQUEST_STATE,
    NGX_HTTP_LINGERING_CLOSE_STATE,
    NGX_HTTP_KEEPALIVE_STATE
} ngx_http_state_e;

Typedef definitions

Similar to Macros, typedef definitions also require at least 2 spaces (usually just 2) before the definition body part. For instance,

typedef u_int  aio_context_t;

More than 2 spaces can be used when a group of typedef definitions are put together and it's nice to have them align up vertically for aesthetic reasons, as in

typedef struct ngx_module_s          ngx_module_t;
typedef struct ngx_conf_s            ngx_conf_t;
typedef struct ngx_cycle_s           ngx_cycle_t;
typedef struct ngx_pool_s            ngx_pool_t;
typedef struct ngx_chain_s           ngx_chain_t;
typedef struct ngx_log_s             ngx_log_t;
typedef struct ngx_open_file_s       ngx_open_file_t;

Tools

The OpenResty team maintains the ngx-releng tool to statically scan the current C source tree for many (but not all) style issues covered in this document. It is a must-have for OpenResty core developers and is also helpful for NGINX module developers and NGINX core hackers in general. We keep adding more checkers to this tool and we welcome your contributions as well.

The clang static code analyzer is also immensely helpful for catching subtle coding problems, and so is compiling everything with gcc's high optimization flags.

Many editors provide features to highlight and/or auto-trim line trailing spaces as well as expanding tabs into spaces. For example, in vim, we could put the following lines to our ~/.vimrc file to highlight any line-trailing white-spaces:

highlight WhiteSpaceEOL ctermbg=darkgreen guibg=lightgreen
match WhiteSpaceEOL /\s$/
autocmd WinEnter * match WhiteSpaceEOL /\s$/

And also to set the indentation facilities properly:

set expandtab
set shiftwidth=4
set softtabstop=4
set tabstop=4

Goto statements and code labels

NGINX uses goto statements wisely for error handling. It is a good use case for the notorious goto statement. Many inexperienced C programmers may panic upon any use of goto statements, which is not fair. It is only bad to use goto statements to jump backward; otherwise it is usually fine, especially for error handling. NGINX requires code labels to be surrounded by blank lines, as in

        p = ngx_pnalloc(pool, len);
        if (p == NULL) {
            goto failed;
        }

        ...

        i++;
    }

    freeaddrinfo(res);
    return NGX_OK;

failed:

    freeaddrinfo(res);
    return NGX_ERROR;
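
Since the excerpt above is only a fragment, here is a complete sketch of the same forward-jumping pattern using the standard C allocator. The type resty_blah_table_t and the two functions are hypothetical, and n is assumed to be greater than zero. Every failure after the first allocation jumps forward to the single failed label, which releases whatever has been allocated so far.

```c
#include <stdlib.h>
#include <string.h>


typedef struct {
    char    *name;
    int     *slots;
    size_t   nslots;
} resty_blah_table_t;


/* allocates a table with a copy of name and n zeroed slots, or NULL */
resty_blah_table_t *
resty_blah_table_create(const char *name, size_t n)
{
    size_t               len;
    resty_blah_table_t  *t;

    t = malloc(sizeof(resty_blah_table_t));
    if (t == NULL) {
        return NULL;
    }

    t->name = NULL;
    t->slots = NULL;
    t->nslots = n;

    len = strlen(name);

    t->name = malloc(len + 1);
    if (t->name == NULL) {
        goto failed;
    }

    memcpy(t->name, name, len + 1);

    t->slots = calloc(n, sizeof(int));
    if (t->slots == NULL) {
        goto failed;
    }

    return t;

failed:

    /* free(NULL) is a no-op, so partially built tables are handled too */
    free(t->slots);
    free(t->name);
    free(t);

    return NULL;
}


void
resty_blah_table_destroy(resty_blah_table_t *t)
{
    if (t == NULL) {
        return;
    }

    free(t->slots);
    free(t->name);
    free(t);
}
```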

Checking pointer nullity

In the NGINX world, we usually use p == NULL instead of !p to check if a pointer value is NULL. Follow this convention wherever possible. It is also recommended to use p != NULL instead of p to test the opposite, but it is also fine to simply use p to test in this case.

Below are some examples:

if (addrs != NULL) {
if (name == NULL) {

Testing against NULL is usually clearer about the nature of the value being checked and thus helps improve code readability.

Author

The author of this guideline is Yichun Zhang, the creator of OpenResty.

Feedback and patches

Feedback and patches are always welcome! They should go to Yichun Zhang's email address yichun@openresty.com.