Suggests using COPY heredoc for file creation instead of RUN echo/cat.
Property    Value
Severity    Info
Category    Performance
Default     Enabled
Auto-fix    Yes (--fix --fix-unsafe)

Description

Suggests replacing RUN echo/cat/printf > file patterns with COPY <<EOF syntax for better performance and readability. This rule detects file creation patterns in RUN instructions and extracts them into COPY heredocs, even when mixed with other commands. It relies on Dockerfile here-documents support for COPY.

Why COPY heredoc?

Three reasons, in order of importance:
  1. Hermeticity and intent. A RUN that writes a file is an opaque shell invocation: the frontend cannot tell, without parsing your shell, whether the instruction installs a package, mutates system state, or just drops a config file. COPY <<EOF declares the same operation as a pure, fully specified input-to-output mapping. See Bazel’s hermeticity guide for the underlying principle — a build step whose output is a function of its declared inputs is easier to cache, reason about, and reproduce.
  2. Observable content for other rules. tally tracks files written via COPY <<EOF (and ADD <<EOF) as observable files: their content is visible at lint time. This lets other rules reason about what the image actually contains. Content hidden inside a RUN echo ... > /path is not observable: tally can only see the shell source, not the final file, so downstream rules give up and the rule set catches fewer real issues in your image.
  3. Predictable cache keys. COPY <<EOF participates in BuildKit’s content cache: the layer hash is derived from the literal heredoc body, so the layer hits cache whenever the content is unchanged. A RUN layer hashes over the command string, so any innocuous edit (trailing whitespace, reordered && clauses, comments) busts the cache even when the file it produces is identical. Splitting “create this file” into its own COPY keeps it out of the rebuild path of unrelated shell changes.
Secondary benefits: COPY --chmod sets permissions in a single layer (no follow-up chmod), and heredoc bodies are easier to read than escaped echo statements with manual \n separators.
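
As a rough illustration of the cache-key and --chmod points above, the two stages below produce the same file. The base image, package, stage names, and paths are placeholders chosen for this sketch, and the cache behaviour described in the comments restates the reasoning above rather than measured output.

```dockerfile
# syntax=docker/dockerfile:1

# Shell form: the file write shares one RUN with unrelated work, so editing
# any clause of the chain changes the command string and re-runs all of it.
FROM alpine:3.20 AS shell-form
RUN apk add --no-cache curl && \
    echo "APP_ENV=production" > /etc/myapp.env && \
    chmod 0600 /etc/myapp.env

# Heredoc form: the file is its own COPY, keyed on the heredoc body, and
# --chmod sets the mode without a follow-up chmod.
FROM alpine:3.20 AS heredoc-form
RUN apk add --no-cache curl
COPY --chmod=0600 <<EOF /etc/myapp.env
APP_ENV=production
EOF
```

In the second stage, changing the apk line should leave the COPY instruction itself untouched, which is the separation the third reason argues for.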

Detected Patterns

  1. Simple file creation: echo "content" > /path/to/file
  2. printf with escape sequences: printf 'line1\nline2\n' > /path/to/file
  3. File creation with chmod: echo "x" > /file && chmod 0755 /file
  4. BuildKit heredoc piped to cat: RUN <<EOF cat > /path/to/file
  5. BuildKit heredoc piped to tee: RUN <<EOF tee /path/to/file
  6. Brace-grouped producers piped to tee: { echo a; printf 'b\n'; } | tee /path (mixes of echo / printf / cat <<EOF are supported)
  7. Consecutive RUN instructions writing to the same file
  8. Mixed commands with file creation in the middle (extracts just the file creation)
  9. Multiple distinct targets in one RUN: a single && chain that writes to several files is split into one COPY <<EOF per target. Any mkdir -p /parent whose target is a prefix of a COPY destination is dropped, since COPY auto-creates parent directories.
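
The Examples section below covers patterns 1 to 3, 8, and 9. For the remaining shapes, a hand-written sketch of inputs that the list above suggests would be flagged might look like this. The image and file paths are hypothetical, and whether an appending second RUN counts under pattern 7 is an assumption based on how the in-RUN append is merged in the Examples section.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20

# Pattern 4: a RUN heredoc fed to cat and redirected to a file.
RUN <<EOF cat > /etc/motd
Welcome to the build image.
EOF

# Pattern 5: a RUN heredoc fed to tee.
RUN <<EOF tee /etc/app.conf
mode = production
EOF

# Pattern 7: consecutive RUN instructions writing to the same file
# (assumed to be merged into a single COPY heredoc).
RUN echo "APP_ENV=production" > /etc/myapp.env
RUN echo "LOG_FORMAT=json" >> /etc/myapp.env
```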

Examples

Before (violation)

RUN cat > /etc/nginx/nginx.conf <<'EOF'
worker_processes auto;
events { worker_connections 1024; }
http {
    server {
        listen 8080;
        location /healthz { return 200 "ok"; }
    }
}
EOF

RUN printf '#!/bin/sh\nexec nginx -g "daemon off;"\n' > /usr/local/bin/start-nginx && \
    chmod 0755 /usr/local/bin/start-nginx

RUN apt-get update && \
    echo "APP_ENV=production" > /etc/myapp.env && \
    echo "LOG_FORMAT=json" >> /etc/myapp.env && \
    apt-get clean

After (fixed with --fix --fix-unsafe)

COPY <<EOF /etc/nginx/nginx.conf
worker_processes auto;
events { worker_connections 1024; }
http {
    server {
        listen 8080;
        location /healthz { return 200 "ok"; }
    }
}
EOF

COPY --chmod=0755 <<EOF /usr/local/bin/start-nginx
#!/bin/sh
exec nginx -g "daemon off;"
EOF

RUN apt-get update

COPY <<EOF /etc/myapp.env
APP_ENV=production
LOG_FORMAT=json
EOF

RUN apt-get clean
Emitted COPY <<EOF blocks are surrounded by blank lines to keep the embedded file content readable; the fix doesn’t inject duplicate blanks when the source already has one adjacent to the replacement.

Multi-target with brace-grouped pipes

A single RUN that builds several config files via { echo ...; echo ...; } | tee /path chains — a common pattern in official images such as php:fpm — expands into one COPY <<EOF per destination. mkdir -p is absorbed when a later COPY writes under the same directory tree, and a leading set -ex is dropped entirely (shell options don’t cross RUN boundaries, so preserving it in a standalone RUN would be pure noise).

Before (violation)

RUN set -ex \
    && { \
        echo '[global]'; \
        echo 'daemonize = no'; \
    } | tee /usr/local/etc/php-fpm.d/www.conf \
    && mkdir -p /usr/local/php/php/auto_prepends \
    && { \
        echo '<?php'; \
        echo 'if (function_exists("uopz_allow_exit")) { uopz_allow_exit(true); }'; \
        echo '?>'; \
    } | tee /usr/local/php/php/auto_prepends/default_prepend.php \
    && { \
        echo 'FromLineOverride=YES'; \
        echo 'UseTLS=NO'; \
    } | tee /etc/ssmtp/ssmtp.conf \
    && { \
        echo '[PHP]'; \
        echo 'log_errors = On'; \
    } | tee /usr/local/etc/php/conf.d/php.ini

After (fixed with --fix --fix-unsafe)

COPY <<EOF /usr/local/etc/php-fpm.d/www.conf
[global]
daemonize = no
EOF

COPY <<EOF /usr/local/php/php/auto_prepends/default_prepend.php
<?php
if (function_exists("uopz_allow_exit")) { uopz_allow_exit(true); }
?>
EOF

COPY <<EOF /etc/ssmtp/ssmtp.conf
FromLineOverride=YES
UseTLS=NO
EOF

COPY <<EOF /usr/local/etc/php/conf.d/php.ini
[PHP]
log_errors = On
EOF

Limitations

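  • Skips append-only operations (>>), since replacing them with COPY would overwrite existing content. An append that follows an overwrite (>) of the same file in the same sequence is merged, as with /etc/myapp.env in the Examples section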
  • Skips relative paths (only absolute paths like /etc/file)
  • Skips commands with shell variables not defined as ARG/ENV
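
To make the limitations concrete, here is a sketch of instructions that the list above suggests the rule would leave alone, followed by one contrasting case. All paths and variable names are invented for illustration, and the final instruction being eligible is an inference from the third bullet, not confirmed tool behaviour.

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine:3.20
WORKDIR /app

# Append only: the result depends on what /etc/profile already contains.
RUN echo 'export PATH="$PATH:/opt/bin"' >> /etc/profile

# Relative path: the destination depends on the current working directory.
RUN echo "debug=false" > app.ini

# Shell variable not declared as ARG or ENV: HOST_ID exists only while
# the step runs, so its value is unknown at lint time.
RUN HOST_ID=$(hostname) && echo "host=$HOST_ID" > /etc/build-host

# Contrast: APP_VERSION is declared with ARG, so this write is presumably
# still a candidate for conversion.
ARG APP_VERSION=1.0.0
RUN echo "version=$APP_VERSION" > /etc/app-version
```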

Mount Handling

Since COPY doesn’t support --mount flags, the rule handles RUN mounts carefully:
Mount Type    Behavior
bind          Skip - content might depend on bound files
cache         Safe if file target is outside cache path
tmpfs         Safe if file target is outside tmpfs path
secret        Safe if file target is outside secret path
ssh           Safe - no content dependency
When extracting file creation from mixed commands, mounts are preserved on the remaining RUN instructions.
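
A hand-written sketch of what that could look like for a cache mount follows. The stage names, base image, and package are placeholders, and the second stage is an approximation of the described behaviour, not captured output from tally.

```dockerfile
# syntax=docker/dockerfile:1

# Before: the file target /etc/myapp.env lies outside the cache path,
# so the table above classifies the extraction as safe.
FROM debian:bookworm-slim AS before
RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update && \
    echo "APP_ENV=production" > /etc/myapp.env && \
    apt-get install -y --no-install-recommends curl

# After: the write becomes a COPY heredoc, and each remaining RUN keeps
# the original --mount flag.
FROM debian:bookworm-slim AS after
RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update

COPY <<EOF /etc/myapp.env
APP_ENV=production
EOF

RUN --mount=type=cache,target=/var/cache/apt \
    apt-get install -y --no-install-recommends curl
```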

Chmod Support

Preserves the original mode notation on COPY --chmod. COPY --chmod accepts both octal and symbolic modes (Dockerfile frontend 1.14+), so the fixer emits whichever form the source wrote:
  • Octal: chmod 755 → --chmod=755, chmod 0755 → --chmod=0755
  • Symbolic: chmod +x → --chmod=+x, chmod u+x → --chmod=u+x
Symbolic modes are copied verbatim — the fixer does not convert them to octal. That keeps the diff minimal and preserves the author’s intent.
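
For instance, a symbolic chmod would carry over unchanged. The sketch below is illustrative only: the script name and stage names are made up, and it assumes a Dockerfile frontend recent enough to accept symbolic modes, as noted above.

```dockerfile
# syntax=docker/dockerfile:1

# Before: the script is created, then made executable with a symbolic mode.
FROM alpine:3.20 AS before
RUN printf '#!/bin/sh\necho hello\n' > /usr/local/bin/hello && \
    chmod +x /usr/local/bin/hello

# After: the same symbolic mode moves onto COPY --chmod verbatim.
FROM alpine:3.20 AS after
COPY --chmod=+x <<EOF /usr/local/bin/hello
#!/bin/sh
echo hello
EOF
```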

Options

Option                    Type       Default    Description
check-single-run          boolean    true       Check for single RUN instructions with file creation
check-consecutive-runs    boolean    true       Check for consecutive RUN instructions to same file

Configuration

[rules.tally.prefer-copy-heredoc]
severity = "style"
check-single-run = true
check-consecutive-runs = true

Rule Coordination

This rule takes priority over prefer-run-heredoc for pure file creation patterns. When both rules detect a pattern, prefer-copy-heredoc handles it.
