PyPI - docker-evaluator - Versions diffs - 0.1.0__tar.gz - Mend

docker-evaluator 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (37)

docker_evaluator-0.1.0/PKG-INFO ADDED

@@ -0,0 +1,125 @@
+Metadata-Version: 2.4
+Name: docker-evaluator
+Version: 0.1.0
+Summary: Generic code evaluation in isolated Docker containers
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+Requires-Dist: docker>=7.1.0
+Requires-Dist: python-dotenv>=1.0.1
+Provides-Extra: dev
+Requires-Dist: pytest>=7.0; extra == "dev"
+Requires-Dist: pytest-mock>=3.0; extra == "dev"
+Requires-Dist: ruff>=0.4.0; extra == "dev"
+# docker-evaluator
+A code evaluation backend for competitive programming judges. Runs untrusted submissions in isolated Docker containers, enforces time and memory limits, and checks output against expected results — similar to how Codeforces/CodeChef judge submissions.
+Supports Python 2/3, C, and C++.
+## Requirements
+- Docker (running and accessible to the current user)
+- Python 3.9+
+## Installation
+```bash
+pip install docker-evaluator
+```
+## Usage
+```python
+from docker_evaluator import DockerEvaluator
+evaluator = DockerEvaluator()
+result = evaluator.evaluate(
+    code='n = int(input()); print(n * 2)',
+    input="21",
+    expected_output="42",
+    language="py3",
+    time_limit=1,
+    memory_limit=256 * 1024,  # 256 MB in KB
+)
+print(result)
+# {'correct': True, 'details': 'OK (8ms)'}
+evaluator.close()
+```
+### Parameters
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `code` | `str` | — | Submission source code |
+| `input` | `str` | — | Problem input (stdin or file content) |
+| `expected_output` | `str` | — | Correct output to compare against |
+| `language` | `str` | — | `"py3"`, `"py2"`, `"c"`, or `"cpp"` |
+| `time_limit` | `int` | — | Time limit in seconds |
+| `input_type` | `str` | `"stdin"` | `"stdin"` or `"file"` |
+| `file_io_name` | `str` | `""` | File name when using file I/O (e.g. `"input.txt"`) |
+| `memory_limit` | `int` | `1024` | Memory limit in KB (minimum enforced: 256 MB) |
+### Return value
+```python
+{"correct": bool, "details": str}
+```
+`details` is one of:
+- `OK (Xms)` — accepted
+- `Wrong Answer`
+- `Time Limit Exceeded`
+- `Memory Limit Exceeded`
+- `Runtime Error (exit code N)`
+- `Compilation Error`
+## Supported languages
+| Key | Language |
+|---|---|
+| `py3` | Python 3 |
+| `py2` | Python 2 |
+| `c` | C |
+| `cpp` | C++ |
+Docker images are built automatically on first use and cached for subsequent runs.
+## File I/O
+For problems that read/write files instead of stdin/stdout:
+```python
+result = evaluator.evaluate(
+    code=open("solution.cpp").read(),
+    input="5\n1 2 3 4 5",
+    expected_output="15",
+    language="cpp",
+    time_limit=2,
+    input_type="file",
+    file_io_name="input.txt",
+)
+```
+## Configuration
+Create a `.env` file in your working directory:
+| Variable | Default | Description |
+|---|---|---|
+| `KEEP_EVAL_CONTAINERS` | `0` | Set to `1` to keep containers after a run (useful for debugging) |
+| `ENVIRONMENT` | — | If set, also loads `.env.<ENVIRONMENT>` |
+## Compilation cache
+C and C++ submissions are cached by source hash — repeated evaluations of the same code skip recompilation. To clear the cache:
+```python
+from docker_evaluator import clear_cache
+clear_cache()
+```

docker_evaluator-0.1.0/README.md ADDED

@@ -0,0 +1,112 @@
+# docker-evaluator
+A code evaluation backend for competitive programming judges. Runs untrusted submissions in isolated Docker containers, enforces time and memory limits, and checks output against expected results — similar to how Codeforces/CodeChef judge submissions.
+Supports Python 2/3, C, and C++.
+## Requirements
+- Docker (running and accessible to the current user)
+- Python 3.9+
+## Installation
+```bash
+pip install docker-evaluator
+```
+## Usage
+```python
+from docker_evaluator import DockerEvaluator
+evaluator = DockerEvaluator()
+result = evaluator.evaluate(
+    code='n = int(input()); print(n * 2)',
+    input="21",
+    expected_output="42",
+    language="py3",
+    time_limit=1,
+    memory_limit=256 * 1024,  # 256 MB in KB
+)
+print(result)
+# {'correct': True, 'details': 'OK (8ms)'}
+evaluator.close()
+```
+### Parameters
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `code` | `str` | — | Submission source code |
+| `input` | `str` | — | Problem input (stdin or file content) |
+| `expected_output` | `str` | — | Correct output to compare against |
+| `language` | `str` | — | `"py3"`, `"py2"`, `"c"`, or `"cpp"` |
+| `time_limit` | `int` | — | Time limit in seconds |
+| `input_type` | `str` | `"stdin"` | `"stdin"` or `"file"` |
+| `file_io_name` | `str` | `""` | File name when using file I/O (e.g. `"input.txt"`) |
+| `memory_limit` | `int` | `1024` | Memory limit in KB (minimum enforced: 256 MB) |
+### Return value
+```python
+{"correct": bool, "details": str}
+```
+`details` is one of:
+- `OK (Xms)` — accepted
+- `Wrong Answer`
+- `Time Limit Exceeded`
+- `Memory Limit Exceeded`
+- `Runtime Error (exit code N)`
+- `Compilation Error`
+## Supported languages
+| Key | Language |
+|---|---|
+| `py3` | Python 3 |
+| `py2` | Python 2 |
+| `c` | C |
+| `cpp` | C++ |
+Docker images are built automatically on first use and cached for subsequent runs.
+## File I/O
+For problems that read/write files instead of stdin/stdout:
+```python
+result = evaluator.evaluate(
+    code=open("solution.cpp").read(),
+    input="5\n1 2 3 4 5",
+    expected_output="15",
+    language="cpp",
+    time_limit=2,
+    input_type="file",
+    file_io_name="input.txt",
+)
+```
+## Configuration
+Create a `.env` file in your working directory:
+| Variable | Default | Description |
+|---|---|---|
+| `KEEP_EVAL_CONTAINERS` | `0` | Set to `1` to keep containers after a run (useful for debugging) |
+| `ENVIRONMENT` | — | If set, also loads `.env.<ENVIRONMENT>` |
+## Compilation cache
+C and C++ submissions are cached by source hash — repeated evaluations of the same code skip recompilation. To clear the cache:
+```python
+from docker_evaluator import clear_cache
+clear_cache()
+```
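The README documents the return value as `{"correct": bool, "details": str}`, with the runtime embedded in the `OK (Xms)` string. A caller that wants the number has to parse it out. The helper below is a minimal sketch of that parsing; `elapsed_ms` is a hypothetical name and is not part of the package.

```python
import re
from typing import Optional

# Hypothetical helper, not part of docker-evaluator: extracts the runtime
# from a result dict shaped like the README's documented return value.
_OK_PATTERN = re.compile(r"^OK \((\d+)ms\)$")


def elapsed_ms(result: dict) -> Optional[int]:
    """Return the reported runtime in milliseconds, or None if unavailable."""
    if not result.get("correct"):
        return None
    match = _OK_PATTERN.match(result.get("details", ""))
    # "OK (time unavailable)" and any other shape fall through to None.
    return int(match.group(1)) if match else None


print(elapsed_ms({"correct": True, "details": "OK (8ms)"}))
print(elapsed_ms({"correct": False, "details": "Wrong Answer"}))
```

Anchoring the pattern keeps an unexpected `details` string from being misread as a timing.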

docker_evaluator-0.1.0/docker_evaluator/__init__.py ADDED

@@ -0,0 +1,6 @@
+from docker_evaluator.disk_helper import clear_cache
+from docker_evaluator.docker_helper import DockerHelper
+from docker_evaluator.env_helper import load_env_variables
+from docker_evaluator.evaluator import DockerEvaluator
+__all__ = ["DockerEvaluator", "DockerHelper", "clear_cache", "load_env_variables"]

docker_evaluator-0.1.0/docker_evaluator/disk_helper.py ADDED

@@ -0,0 +1,41 @@
+import hashlib
+import os
+import shutil
+import tempfile
+import threading
+_compile_locks = {}
+_compile_locks_mutex = threading.Lock()
+def get_compile_lock(cache_dir):
+    with _compile_locks_mutex:
+        if cache_dir not in _compile_locks:
+            _compile_locks[cache_dir] = threading.Lock()
+        return _compile_locks[cache_dir]
+_CACHE_BASE = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "compilation_cache"))
+def get_cache_dir(code, language):
+    code_hash = hashlib.sha256(code.encode()).hexdigest()
+    cache_dir = os.path.join(_CACHE_BASE, language, code_hash)
+    os.makedirs(cache_dir, exist_ok=True)
+    return cache_dir
+def clear_cache():
+    if os.path.exists(_CACHE_BASE):
+        shutil.rmtree(_CACHE_BASE)
+    print("Compilation cache cleared.")
+def get_temp_dir(files):
+    temp_dir = tempfile.mkdtemp()
+    test_data_dir = os.path.join(temp_dir, "test_data")
+    os.makedirs(test_data_dir, exist_ok=True)
+    for file in files:
+        with open(os.path.join(test_data_dir, file["name"]), "w", encoding="utf-8", errors="replace") as file_writer:
+            file_writer.write(file["content"])
+    return test_data_dir
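`get_cache_dir` keys the compilation cache on the SHA-256 of the source text plus the language name. The sketch below reproduces only that path derivation, without creating directories, to show which inputs change the key. `cache_path` and the `/tmp/compilation_cache` base are illustrative assumptions, not names from the package.

```python
import hashlib
import os


def cache_path(cache_base: str, code: str, language: str) -> str:
    """Derive a cache directory path from the source text and language."""
    code_hash = hashlib.sha256(code.encode()).hexdigest()
    return os.path.join(cache_base, language, code_hash)


base = "/tmp/compilation_cache"  # assumed base directory for illustration
source = "int main() { return 0; }"

same_a = cache_path(base, source, "c")
same_b = cache_path(base, source, "c")
other_language = cache_path(base, source, "cpp")
other_source = cache_path(base, source + "\n", "c")

print(same_a == same_b)          # identical source and language share a path
print(same_a == other_language)  # language is part of the path
print(same_a == other_source)    # any byte-level change yields a new hash
```

The key covers only the source and the language. Compiler flags and the `gcc:latest` image version are not part of it, so a binary cached under an older toolchain would presumably be reused until `clear_cache()` is called.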

docker_evaluator-0.1.0/docker_evaluator/docker_helper.py ADDED

@@ -0,0 +1,50 @@
+import os
+import docker
+from docker.errors import ContainerError
+from docker.types import LogConfig
+class DockerHelper:
+    def __init__(self):
+        self.client = docker.from_env()
+    def image_exists(self, image_tag):
+        images = self.client.images.list()
+        image_names = []
+        for image in images:
+            image_names.extend(image.tags)
+        return image_tag in image_names
+    def create_image(self, image_path, image_tag):
+        self.client.images.build(path=image_path, tag=image_tag)
+    def evaluate(self, image_name, volume, environment_variables, cpus=1, memory_limit_mb=None, cache_dir=None):
+        volumes = {volume: {"bind": "/test_data", "mode": "ro"}}
+        if cache_dir:
+            volumes[cache_dir] = {"bind": "/cache", "mode": "rw"}
+        # Read env at call time because env files may be loaded after helper init.
+        keep_eval_containers = os.getenv("KEEP_EVAL_CONTAINERS", "0") == "1"
+        try:
+            logs = self.client.containers.run(
+                image_name,
+                volumes=volumes,
+                detach=False,
+                environment=environment_variables,
+                remove=not keep_eval_containers,
+                log_config=LogConfig(type="json-file"),
+                nano_cpus=int(cpus * 1e9),
+                mem_limit=f"{memory_limit_mb}m" if memory_limit_mb else None,
+                memswap_limit=f"{memory_limit_mb}m" if memory_limit_mb else None,
+                network_disabled=True,
+                pids_limit=64,
+            )
+        except ContainerError as e:
+            if e.exit_status == 137:
+                return "Memory Limit Exceeded"
+            return f"Runtime Error (exit code {e.exit_status})"
+        output = logs.decode("utf-8", errors="replace").rstrip()
+        return output
+    def close(self):
+        self.client.close()
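`DockerHelper.evaluate` converts its `cpus` and `memory_limit_mb` arguments into the keyword arguments passed to `containers.run`. The function below is a sketch that isolates that conversion so it can be inspected without a Docker daemon. `resource_kwargs` is a hypothetical name; the keys and values mirror the call in the diff.

```python
from typing import Optional


def resource_kwargs(cpus: float = 1, memory_limit_mb: Optional[int] = None) -> dict:
    """Build the resource-related keyword arguments used for a container run."""
    limit = f"{memory_limit_mb}m" if memory_limit_mb else None
    return {
        # Docker expresses CPU quota in units of 1e-9 CPUs.
        "nano_cpus": int(cpus * 1e9),
        "mem_limit": limit,
        # Same value as mem_limit: the total of memory plus swap equals
        # the memory limit, leaving no swap headroom.
        "memswap_limit": limit,
        "network_disabled": True,
        "pids_limit": 64,
    }


print(resource_kwargs(cpus=1, memory_limit_mb=288))
print(resource_kwargs(cpus=0.5))
```

Setting `memswap_limit` equal to `mem_limit` stops a submission from spilling into swap once it reaches the memory cap.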

docker_evaluator-0.1.0/docker_evaluator/env_helper.py ADDED

@@ -0,0 +1,8 @@
+import os
+from dotenv import load_dotenv
+def load_env_variables():
+    load_dotenv(".env")
+    load_dotenv(f".env.{os.getenv('ENVIRONMENT')}")
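The README says `.env.<ENVIRONMENT>` is loaded "if set", but `load_env_variables` builds the second file name unconditionally. When `ENVIRONMENT` is unset, the f-string evaluates to `.env.None`. That is presumably harmless unless a file with that literal name exists. The sketch below shows one way to make the condition explicit; `env_files` is a hypothetical helper that only computes the file names and does not call python-dotenv.

```python
from typing import List, Optional


def env_files(environment: Optional[str] = None) -> List[str]:
    """Return the env files to load, in order, for a given ENVIRONMENT value."""
    files = [".env"]
    if environment:  # skip the suffixed file when ENVIRONMENT is unset or empty
        files.append(f".env.{environment}")
    return files


print(env_files())
print(env_files("staging"))
# What the unguarded f-string produces when the variable is missing:
print(f".env.{None}")
```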

docker_evaluator-0.1.0/docker_evaluator/evaluator.py ADDED

@@ -0,0 +1,49 @@
+from docker_evaluator.docker_helper import DockerHelper
+from docker_evaluator.env_helper import load_env_variables
+from docker_evaluator.language_helpers.c_helper.c_helper import CHelper
+from docker_evaluator.language_helpers.cpp_helper.cpp_helper import CppHelper
+from docker_evaluator.language_helpers.py2_helper.py2_helper import Py2Helper
+from docker_evaluator.language_helpers.py3_helper.py3_helper import Py3Helper
+class DockerEvaluator:
+    def __init__(self, docker_client=None):
+        load_env_variables()
+        self.docker_helper = docker_client
+        if docker_client is None:
+            self.docker_helper = DockerHelper()
+        self.language_helpers = [
+            CHelper(self.docker_helper),
+            CppHelper(self.docker_helper),
+            Py2Helper(self.docker_helper),
+            Py3Helper(self.docker_helper),
+        ]
+    def evaluate(
+        self, code, input, expected_output, language, time_limit, input_type="stdin", file_io_name="", memory_limit=1024
+    ):
+        # Enforce minimum memory limit of 256MB to avoid spurious segfaults on valid code.
+        # memory_limit is in KB, so 262144 KB = 256 MB.
+        MIN_MEMORY_KB = 256 * 1024
+        memory_limit = max(memory_limit, MIN_MEMORY_KB)
+        for language_helper in self.language_helpers:
+            if language_helper.language == language:
+                output = language_helper.evaluate(
+                    code, input, time_limit, input_type, file_io_name, memory_limit=memory_limit
+                )
+                # Extract container-side timing appended by entrypoint as last line
+                time_str = None
+                lines = output.split("\n")
+                if lines and lines[-1].startswith("__TIME__:"):
+                    time_str = lines[-1][len("__TIME__:") :]
+                    output = "\n".join(lines[:-1]).rstrip()
+                if "Limit Exceeded" in output or "Compilation Error" in output or "Runtime Error" in output:
+                    return {"correct": False, "details": output}
+                elif output.split() != expected_output.split():
+                    return {"correct": False, "details": "Wrong Answer"}
+                return {"correct": True, "details": f"OK ({time_str})" if time_str else "OK (time unavailable)"}
+    def close(self):
+        self.docker_helper.close()
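The verdict logic in `DockerEvaluator.evaluate` can be exercised without Docker by feeding it the text a container would emit. The sketch below copies that logic into a standalone function, with `judge` as a hypothetical name. The sample inputs assume the shape produced by the entrypoints: program output followed by a `__TIME__:` line, with trailing whitespace already stripped.

```python
def judge(raw_output: str, expected_output: str) -> dict:
    """Turn container output into a verdict, mirroring the logic in the diff."""
    time_str = None
    lines = raw_output.split("\n")
    # The entrypoint appends the elapsed time as the last line.
    if lines and lines[-1].startswith("__TIME__:"):
        time_str = lines[-1][len("__TIME__:"):]
        raw_output = "\n".join(lines[:-1]).rstrip()

    markers = ("Limit Exceeded", "Compilation Error", "Runtime Error")
    if any(marker in raw_output for marker in markers):
        return {"correct": False, "details": raw_output}
    # Whitespace-insensitive comparison: only the token sequence matters.
    if raw_output.split() != expected_output.split():
        return {"correct": False, "details": "Wrong Answer"}
    details = f"OK ({time_str})" if time_str else "OK (time unavailable)"
    return {"correct": True, "details": details}


print(judge("42\n\n__TIME__:8ms", "42\n"))
print(judge("1  2\n3\n__TIME__:5ms", "1 2 3"))
print(judge("41\n__TIME__:3ms", "42"))
```

Because the error check is a substring match on the program's own output, a submission whose correct answer contains the text `Runtime Error` would be reported as failed. The loop in the diff also returns `None` implicitly when no helper matches `language`.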

docker_evaluator-0.1.0/docker_evaluator/language_helpers/__init__.py ADDED

File without changes

docker_evaluator-0.1.0/docker_evaluator/language_helpers/c_helper/Dockerfile ADDED

@@ -0,0 +1,7 @@
+FROM gcc:latest
+WORKDIR /app
+COPY entrypoint.sh .
+RUN chmod +x ./entrypoint.sh
+ENTRYPOINT ["./entrypoint.sh"]

docker_evaluator-0.1.0/docker_evaluator/language_helpers/c_helper/__init__.py ADDED

File without changes

docker_evaluator-0.1.0/docker_evaluator/language_helpers/c_helper/c_helper.py ADDED

@@ -0,0 +1,8 @@
+import os
+from docker_evaluator.language_helpers.language_helper import LanguageHelper
+class CHelper(LanguageHelper):
+    def __init__(self, docker_helper):
+        super().__init__(docker_helper, os.path.dirname(__file__), "c", "c", 1, cache_compilation=True)

docker_evaluator-0.1.0/docker_evaluator/language_helpers/c_helper/entrypoint.sh ADDED

@@ -0,0 +1,56 @@
+#!/bin/sh
+if [ -f /cache/main ]; then
+  cp /cache/main ./main
+else
+  compile_output=$(gcc -std=c99 -O2 -o ./main /test_data/target.c 2>&1)
+  if [ $? -ne 0 ]; then
+    echo "Compilation Error: $compile_output"
+    exit 0
+  fi
+  cp ./main /cache/main 2>/dev/null || true
+fi
+cp /test_data/target.in ./target.in
+start_ms=$(date +%s%3N)
+if [ "$INPUT_TYPE" = "file" ]; then
+  cp ./target.in ./${FILE_IO_NAME}.in
+  if [ -n "$MEMORY_LIMIT_KB" ] && [ "$MEMORY_LIMIT_KB" -gt 0 ] 2>/dev/null; then
+    timeout -k 1 ${TIME_LIMIT} sh -c "ulimit -v \"$MEMORY_LIMIT_KB\" && exec ./main"
+  else
+    timeout -k 1 ${TIME_LIMIT} ./main
+  fi
+  exit_code=$?
+else
+  if [ -n "$MEMORY_LIMIT_KB" ] && [ "$MEMORY_LIMIT_KB" -gt 0 ] 2>/dev/null; then
+    timeout -k 1 ${TIME_LIMIT} sh -c "ulimit -v \"$MEMORY_LIMIT_KB\" && exec ./main" < ./target.in > ./result.out
+  else
+    timeout -k 1 ${TIME_LIMIT} ./main < ./target.in > ./result.out
+  fi
+  exit_code=$?
+fi
+end_ms=$(date +%s%3N)
+elapsed_ms=$((end_ms - start_ms))
+if [ $exit_code -eq 124 ]; then
+  echo "Time Limit Exceeded"
+  exit 0
+fi
+if [ $exit_code -eq 137 ]; then
+  echo "Memory Limit Exceeded"
+  exit 0
+fi
+if [ $exit_code -ne 0 ]; then
+  echo "Runtime Error (exit code $exit_code)"
+  exit 0
+fi
+if [ "$INPUT_TYPE" = "file" ]; then
+  cat ./${FILE_IO_NAME}.out 2>/dev/null
+else
+  cat ./result.out
+fi
+printf '\n__TIME__:%sms\n' "${elapsed_ms}"
+exit 0
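The entrypoint maps the exit status of `timeout` to a verdict string. The Python function below is a sketch restating that mapping so the branches are easier to check. `verdict_for_exit_code` is a hypothetical name, and `None` stands for "no error, print the program output".

```python
from typing import Optional


def verdict_for_exit_code(exit_code: int) -> Optional[str]:
    """Mirror the entrypoint's exit-status checks, in the same order."""
    if exit_code == 124:  # GNU timeout's status when the time limit is hit
        return "Time Limit Exceeded"
    if exit_code == 137:  # 128 + 9: the process was killed with SIGKILL
        return "Memory Limit Exceeded"
    if exit_code != 0:
        return f"Runtime Error (exit code {exit_code})"
    return None


for code in (0, 124, 137, 139, 1):
    print(code, verdict_for_exit_code(code))
```

Status 137 only says that SIGKILL was delivered. GNU `timeout` also exits with 137 when the `-k` follow-up kill has to be used, so a submission that ignores SIGTERM after the time limit would appear to be reported as a memory failure. Status 139 (128 + 11, a segmentation fault) falls through to the generic runtime-error branch.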

docker_evaluator-0.1.0/docker_evaluator/language_helpers/cpp_helper/Dockerfile ADDED

@@ -0,0 +1,7 @@
+FROM gcc:latest
+WORKDIR /app
+COPY entrypoint.sh .
+RUN chmod +x ./entrypoint.sh
+ENTRYPOINT ["./entrypoint.sh"]

docker_evaluator-0.1.0/docker_evaluator/language_helpers/cpp_helper/__init__.py ADDED

File without changes

docker_evaluator-0.1.0/docker_evaluator/language_helpers/cpp_helper/cpp_helper.py ADDED

@@ -0,0 +1,8 @@
+import os
+from docker_evaluator.language_helpers.language_helper import LanguageHelper
+class CppHelper(LanguageHelper):
+    def __init__(self, docker_helper):
+        super().__init__(docker_helper, os.path.dirname(__file__), "cpp", "cpp", 1, cache_compilation=True)

docker_evaluator-0.1.0/docker_evaluator/language_helpers/cpp_helper/entrypoint.sh ADDED

@@ -0,0 +1,56 @@
+#!/bin/sh
+if [ -f /cache/main ]; then
+  cp /cache/main ./main
+else
+  compile_output=$(g++ -std=c++20 -O2 -o ./main /test_data/target.cpp 2>&1)
+  if [ $? -ne 0 ]; then
+    echo "Compilation Error: $compile_output"
+    exit 0
+  fi
+  cp ./main /cache/main 2>/dev/null || true
+fi
+cp /test_data/target.in ./target.in
+start_ms=$(date +%s%3N)
+if [ "$INPUT_TYPE" = "file" ]; then
+  cp ./target.in ./${FILE_IO_NAME}.in
+  if [ -n "$MEMORY_LIMIT_KB" ] && [ "$MEMORY_LIMIT_KB" -gt 0 ] 2>/dev/null; then
+    timeout -k 1 ${TIME_LIMIT} sh -c "ulimit -v \"$MEMORY_LIMIT_KB\" && exec ./main"
+  else
+    timeout -k 1 ${TIME_LIMIT} ./main
+  fi
+  exit_code=$?
+else
+  if [ -n "$MEMORY_LIMIT_KB" ] && [ "$MEMORY_LIMIT_KB" -gt 0 ] 2>/dev/null; then
+    timeout -k 1 ${TIME_LIMIT} sh -c "ulimit -v \"$MEMORY_LIMIT_KB\" && exec ./main" < ./target.in > ./result.out
+  else
+    timeout -k 1 ${TIME_LIMIT} ./main < ./target.in > ./result.out
+  fi
+  exit_code=$?
+fi
+end_ms=$(date +%s%3N)
+elapsed_ms=$((end_ms - start_ms))
+if [ $exit_code -eq 124 ]; then
+  echo "Time Limit Exceeded"
+  exit 0
+fi
+if [ $exit_code -eq 137 ]; then
+  echo "Memory Limit Exceeded"
+  exit 0
+fi
+if [ $exit_code -ne 0 ]; then
+  echo "Runtime Error (exit code $exit_code)"
+  exit 0
+fi
+if [ "$INPUT_TYPE" = "file" ]; then
+  cat ./${FILE_IO_NAME}.out 2>/dev/null
+else
+  cat ./result.out
+fi
+printf '\n__TIME__:%sms\n' "${elapsed_ms}"
+exit 0
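In file mode, both the C and C++ entrypoints copy the input to `${FILE_IO_NAME}.in` and read the answer from `${FILE_IO_NAME}.out`. `FILE_IO_NAME` therefore appears to be a base name rather than a full file name. The sketch below computes the two paths the script would use; `file_io_paths` is a hypothetical helper.

```python
from typing import Tuple


def file_io_paths(file_io_name: str) -> Tuple[str, str]:
    """Return the (input, output) file names the entrypoint derives."""
    return (f"{file_io_name}.in", f"{file_io_name}.out")


print(file_io_paths("sum"))        # a base name gives sum.in / sum.out
print(file_io_paths("input.txt"))  # the README's example value
```

With the README's example value `"input.txt"`, the script would seem to create `input.txt.in` and expect `input.txt.out`, so a submission that opens `input.txt` directly would not find its input.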

docker_evaluator-0.1.0/docker_evaluator/language_helpers/language_helper.py ADDED

@@ -0,0 +1,83 @@
+import os
+from docker_evaluator.disk_helper import get_cache_dir, get_compile_lock, get_temp_dir
+class LanguageHelper:
+    def __init__(
+        self,
+        docker_helper,
+        docker_context_path,
+        language,
+        file_extension,
+        language_time_limit_multiplier,
+        memory_overhead_mb=32,
+        cache_compilation=False,
+    ):
+        docker_image_name = f"docker-evaluator-{language}"
+        self.docker_helper = docker_helper
+        self.docker_image_name = docker_image_name
+        self.language = language
+        self.language_time_limit_multiplier = language_time_limit_multiplier
+        self.memory_overhead_mb = memory_overhead_mb
+        self.cache_compilation = cache_compilation
+        self.file_extension = file_extension
+        self.initialize(docker_context_path, docker_image_name)
+    def initialize(self, docker_context_path, docker_image_name):
+        docker_image_tag = f"{docker_image_name}:latest"
+        if not self.docker_helper.image_exists(docker_image_tag):
+            print(f"Image {docker_image_tag} not found, building...")
+            self.docker_helper.create_image(docker_context_path, docker_image_tag)
+            print(f"Image {docker_image_tag} built successfully.")
+    def evaluate(self, code, code_input, time_limit, input_type="stdin", file_io_name="", memory_limit=1024):
+        files = [
+            {"name": f"target.{self.file_extension}", "content": code},
+            {"name": "target.in", "content": code_input},
+        ]
+        # Add grace to compensate for Docker/WSL2 scheduling jitter on Windows.
+        # timeout measures wall clock, not CPU time, so the process may get less
+        # than a full CPU-second per wall-second under load.
+        GRACE_S = 0.2
+        effective_time_limit = time_limit * self.language_time_limit_multiplier + GRACE_S
+        environment_variables = {
+            "TIME_LIMIT": effective_time_limit,
+            "LANG": "C.UTF-8",
+            "LC_ALL": "C.UTF-8",
+            "INPUT_TYPE": input_type,
+            "FILE_IO_NAME": file_io_name or "",
+            "MEMORY_LIMIT_KB": str(memory_limit),
+        }
+        memory_limit_mb = memory_limit // 1024
+        total_memory_mb = memory_limit_mb + self.memory_overhead_mb
+        # C/C++ compilation can briefly use much more memory than runtime.
+        # Keep a minimum container budget for compile-heavy languages, while
+        # entrypoints enforce the requested runtime limit with ulimit.
+        compile_safe_memory_mb = max(total_memory_mb, 1536) if self.cache_compilation else total_memory_mb
+        temp_dir = get_temp_dir(files)
+        cache_dir = get_cache_dir(code, self.language) if self.cache_compilation else None
+        cache_file = os.path.join(cache_dir, "main") if cache_dir else None
+        cache_status = "disabled" if not cache_dir else ("hit" if os.path.exists(cache_file) else "miss")
+        print(
+            f"env: {environment_variables}, time: {time_limit}s x{self.language_time_limit_multiplier} +{GRACE_S}s grace = {effective_time_limit}s, memory: {memory_limit}KB ({memory_limit_mb}MB) + {self.memory_overhead_mb}MB overhead = {total_memory_mb}MB, container: {compile_safe_memory_mb}MB, cache: {cache_status}"
+        )
+        # On a cache miss, serialize via a per-hash lock so only one container
+        # compiles at a time. This prevents simultaneous writes to the same
+        # Windows volume path from hanging. Cache hits run without the lock.
+        if cache_dir and not os.path.exists(cache_file):
+            with get_compile_lock(cache_dir):
+                return self.docker_helper.evaluate(
+                    self.docker_image_name,
+                    temp_dir,
+                    environment_variables=environment_variables,
+                    memory_limit_mb=compile_safe_memory_mb,
+                    cache_dir=cache_dir,
+                )
+        return self.docker_helper.evaluate(
+            self.docker_image_name,
+            temp_dir,
+            environment_variables=environment_variables,
+            memory_limit_mb=compile_safe_memory_mb,
+            cache_dir=cache_dir,
+        )
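`LanguageHelper.evaluate` derives the values it passes to the container from the requested limits: the effective time limit, the runtime memory limit in MB, and the container memory budget. The function below is a sketch restating that arithmetic with the constants from the diff (0.2 s grace, 32 MB overhead, 1536 MB floor for compiled languages). `compute_limits` and its parameter names are hypothetical.

```python
def compute_limits(
    time_limit: float,
    memory_limit_kb: int,
    multiplier: float = 1,
    overhead_mb: int = 32,
    cache_compilation: bool = False,
    grace_s: float = 0.2,
    compile_floor_mb: int = 1536,
) -> dict:
    """Reproduce the limit arithmetic from LanguageHelper.evaluate."""
    effective_time_limit = time_limit * multiplier + grace_s
    memory_limit_mb = memory_limit_kb // 1024
    total_memory_mb = memory_limit_mb + overhead_mb
    # Compiled languages get a larger container so the compiler itself fits;
    # the runtime limit is still enforced inside the container via ulimit.
    container_mb = max(total_memory_mb, compile_floor_mb) if cache_compilation else total_memory_mb
    return {
        "effective_time_limit": effective_time_limit,
        "memory_limit_mb": memory_limit_mb,
        "total_memory_mb": total_memory_mb,
        "container_mb": container_mb,
    }


# 1 s and 256 MB for a C submission versus a language without compilation.
print(compute_limits(1, 256 * 1024, cache_compilation=True))
print(compute_limits(1, 256 * 1024, cache_compilation=False))
```

For C and C++ at the 256 MB minimum, the container cap (1536 MB) sits well above the `ulimit -v` value. In that configuration the in-container `ulimit` is what actually constrains the submission.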

docker_evaluator-0.1.0/docker_evaluator/language_helpers/py2_helper/Dockerfile ADDED

@@ -0,0 +1,7 @@
+FROM python:2
+WORKDIR /app
+COPY entrypoint.sh .
+RUN chmod +x ./entrypoint.sh
+ENTRYPOINT ["./entrypoint.sh"]

docker_evaluator-0.1.0/docker_evaluator/language_helpers/py2_helper/__init__.py ADDED

File without changes