PyPI - xmlpydict - Versions diffs - 0.0.7__tar.gz → 0.0.11__tar.gz - Mend

xmlpydict 0.0.7tar.gz → 0.0.11tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

xmlpydict-0.0.11/PKG-INFO +111 -0
xmlpydict-0.0.11/README.md +84 -0
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/pyproject.toml +3 -3
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/setup.py +2 -2
xmlpydict-0.0.11/src/xmlparse.cpp +222 -0
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/tests/test_parse.py +88 -20
xmlpydict-0.0.11/xmlpydict/__init__.py +45 -0
xmlpydict-0.0.11/xmlpydict.egg-info/PKG-INFO +111 -0
xmlpydict-0.0.11/xmlpydict.egg-info/SOURCES.txt +13 -0
{xmlpydict-0.0.7/src → xmlpydict-0.0.11}/xmlpydict.egg-info/requires.txt +1 -1
xmlpydict-0.0.11/xmlpydict.egg-info/top_level.txt +2 -0
xmlpydict-0.0.7/PKG-INFO +0 -70
xmlpydict-0.0.7/README.md +0 -44
xmlpydict-0.0.7/src/xmlparse.cpp +0 -378
xmlpydict-0.0.7/src/xmlparse.py +0 -68
xmlpydict-0.0.7/src/xmlpydict.egg-info/PKG-INFO +0 -70
xmlpydict-0.0.7/src/xmlpydict.egg-info/SOURCES.txt +0 -14
xmlpydict-0.0.7/src/xmlpydict.egg-info/top_level.txt +0 -2
xmlpydict-0.0.7/tests/test.py +0 -24
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/LICENSE +0 -0
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/MANIFEST.in +0 -0
{xmlpydict-0.0.7 → xmlpydict-0.0.11}/setup.cfg +0 -0
{xmlpydict-0.0.7/src → xmlpydict-0.0.11}/xmlpydict.egg-info/dependency_links.txt +0 -0

xmlpydict-0.0.11/PKG-INFO ADDED

@@ -0,0 +1,111 @@
+Metadata-Version: 2.4
+Name: xmlpydict
+Version: 0.0.11
+Summary: xml to dictionary tool for python
+Author-email: Matthew Taylor <matthew.taylor.andre@gmail.com>
+Project-URL: Homepage, https://github.com/MatthewAndreTaylor/xml-to-pydict
+Keywords: xml,dictionary
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3 :: Only
+Classifier: Programming Language :: Python :: 3.8
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: Implementation :: CPython
+Classifier: Topic :: Software Development :: Libraries :: Python Modules
+Classifier: Topic :: Text Processing :: Markup :: XML
+Requires-Python: >=3.7
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Provides-Extra: tests
+Requires-Dist: pytest; extra == "tests"
+Requires-Dist: requests; extra == "tests"
+Dynamic: license-file
+# xmlpydict 📑
+[![XML Tests](https://github.com/MatthewAndreTaylor/xml-to-pydict/actions/workflows/tests.yml/badge.svg)](https://github.com/MatthewAndreTaylor/xml-to-pydict/actions/workflows/tests.yml)
+[![PyPI versions](https://img.shields.io/badge/python-3.8%2B-blue)](https://github.com/MatthewAndreTaylor/xml-to-pydict)
+[![PyPI](https://img.shields.io/pypi/v/xmlpydict.svg)](https://pypi.org/project/xmlpydict/)
+## Requirements
+- `python 3.8+`
+## Installation
+To install xmlpydict, using pip:
+```bash
+pip install xmlpydict
+```
+## Quickstart
+```py
+>>> from xmlpydict import parse
+>>> parse("<package><xmlpydict language='python'/></package>")
+{'package': {'xmlpydict': {'@language': 'python'}}}
+>>> parse("<person name='Matthew'>Hello!</person>")
+{'person': {'@name': 'Matthew', '#text': 'Hello!'}}
+```
+## Goals
+Create a consistent parsing strategy between XML and Python dictionaries using the specification found [here](https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html). `xmlpydict` focuses on speed; see the benchmarks below.
+<img width="256" alt="small_xml_document" src="https://github.com/user-attachments/assets/0248a408-6bb6-4790-bd0f-f90537e2f21a" />
+<img width="256" alt="large_xml_document" src="https://github.com/user-attachments/assets/539a2a69-f475-46a5-bffc-1e8805a5a5e7" />
+### xmlpydict supports the following
+[CDataSection](https://www.w3.org/TR/xml/#sec-cdata-sect):  CDATA Sections are stored as {'#text': CData}.
+[Comments](https://www.w3.org/TR/xml/#sec-comments):  Comments are tokenized for correctness, but have no effect on what is returned.
+[Element Tags](https://www.w3.org/TR/xml/#sec-starttags):  Allows for duplicate attributes, however only the latest defined will be taken.
+[Characters](https://www.w3.org/TR/xml/#charsets):  Similar to CDATA text is stored as {'#text': Char} , however this text is stripped.
+```py
+# Empty tags are containers
+>>> from xmlpydict import parse
+>>> parse("<a></a>")
+{'a': None}
+>>> parse("<a/>")
+{'a': None}
+>>> parse("<a/>").get('href')
+None
+```
+### Attribute prefixing
+```py
+# Change prefix from default "@" with keyword argument attr_prefix
+>>> from xmlpydict import parse
+>>> parse('<p width="10" height="5"></p>', attr_prefix="$")
+{"p": {"$width": "10", "$height": "5"}}
+```
+### Exceptions
+```py
+# Grammar and structure of the xml_content is checked while parsing
+>>> from xmlpydict import parse
+>>> parse("<a></ a>")
+xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 5
+```
+### Unsupported
+Prolog / Enforcing Document Type Definition and Element Type Declarations
+Entity Referencing
+Namespaces

xmlpydict-0.0.11/README.md ADDED

@@ -0,0 +1,84 @@
+# xmlpydict 📑
+[![XML Tests](https://github.com/MatthewAndreTaylor/xml-to-pydict/actions/workflows/tests.yml/badge.svg)](https://github.com/MatthewAndreTaylor/xml-to-pydict/actions/workflows/tests.yml)
+[![PyPI versions](https://img.shields.io/badge/python-3.8%2B-blue)](https://github.com/MatthewAndreTaylor/xml-to-pydict)
+[![PyPI](https://img.shields.io/pypi/v/xmlpydict.svg)](https://pypi.org/project/xmlpydict/)
+## Requirements
+- `python 3.8+`
+## Installation
+To install xmlpydict, using pip:
+```bash
+pip install xmlpydict
+```
+## Quickstart
+```py
+>>> from xmlpydict import parse
+>>> parse("<package><xmlpydict language='python'/></package>")
+{'package': {'xmlpydict': {'@language': 'python'}}}
+>>> parse("<person name='Matthew'>Hello!</person>")
+{'person': {'@name': 'Matthew', '#text': 'Hello!'}}
+```
+## Goals
+Create a consistent parsing strategy between XML and Python dictionaries using the specification found [here](https://www.xml.com/pub/a/2006/05/31/converting-between-xml-and-json.html). `xmlpydict` focuses on speed; see the benchmarks below.
+<img width="256" alt="small_xml_document" src="https://github.com/user-attachments/assets/0248a408-6bb6-4790-bd0f-f90537e2f21a" />
+<img width="256" alt="large_xml_document" src="https://github.com/user-attachments/assets/539a2a69-f475-46a5-bffc-1e8805a5a5e7" />
+### xmlpydict supports the following
+[CDataSection](https://www.w3.org/TR/xml/#sec-cdata-sect):  CDATA Sections are stored as {'#text': CData}.
+[Comments](https://www.w3.org/TR/xml/#sec-comments):  Comments are tokenized for correctness, but have no effect on what is returned.
+[Element Tags](https://www.w3.org/TR/xml/#sec-starttags):  Allows for duplicate attributes, however only the latest defined will be taken.
+[Characters](https://www.w3.org/TR/xml/#charsets):  Similar to CDATA text is stored as {'#text': Char} , however this text is stripped.
+```py
+# Empty tags are containers
+>>> from xmlpydict import parse
+>>> parse("<a></a>")
+{'a': None}
+>>> parse("<a/>")
+{'a': None}
+>>> parse("<a/>").get('href')
+None
+```
+### Attribute prefixing
+```py
+# Change prefix from default "@" with keyword argument attr_prefix
+>>> from xmlpydict import parse
+>>> parse('<p width="10" height="5"></p>', attr_prefix="$")
+{"p": {"$width": "10", "$height": "5"}}
+```
+### Exceptions
+```py
+# Grammar and structure of the xml_content is checked while parsing
+>>> from xmlpydict import parse
+>>> parse("<a></ a>")
+xml.parsers.expat.ExpatError: not well-formed (invalid token): line 1, column 5
+```
+### Unsupported
+Prolog / Enforcing Document Type Definition and Element Type Declarations
+Entity Referencing
+Namespaces
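
The Exceptions section of the README shows only a bare traceback. Since `parse` in 0.0.11 delegates to `xml.parsers.expat` (see `xmlpydict/__init__.py` later in this diff), malformed input should surface as `expat.ExpatError`, which carries `code`, `lineno`, and `offset` attributes. The sketch below calls the standard-library parser directly so that it runs without the package installed; `describe_xml_error` is a hypothetical helper for illustration, not part of xmlpydict.

```python
from xml.parsers import expat


def describe_xml_error(xml_content):
    """Return None for well-formed input, else a short error summary.

    Hypothetical helper: it uses stdlib expat directly rather than
    xmlpydict.parse, on the assumption that both raise the same error.
    """
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_content, True)
    except expat.ExpatError as err:
        # ErrorString maps expat's numeric error code to its message text.
        return {
            "message": expat.ErrorString(err.code),
            "line": err.lineno,
            "column": err.offset,
        }
    return None


print(describe_xml_error("<a></a>"))   # well-formed input
print(describe_xml_error("<a></ a>"))  # the malformed input from the README
```

Returning a small dictionary instead of re-raising is only one option; callers that need the original traceback would catch `expat.ExpatError` themselves.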

{xmlpydict-0.0.7 → xmlpydict-0.0.11}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "xmlpydict"
-version = "0.0.7"
+version = "0.0.11"
 description="xml to dictionary tool for python"
 authors = [
     {name = "Matthew Taylor", email = "matthew.taylor.andre@gmail.com"},
@@ -18,13 +18,13 @@ classifiers = [
     "License :: OSI Approved :: MIT License",
     "Programming Language :: Python :: 3",
     "Programming Language :: Python :: 3 :: Only",
-    "Programming Language :: Python :: 3.7",
     "Programming Language :: Python :: 3.8",
     "Programming Language :: Python :: 3.9",
     "Programming Language :: Python :: 3.10",
     "Programming Language :: Python :: 3.11",
     "Programming Language :: Python :: Implementation :: CPython",
     "Topic :: Software Development :: Libraries :: Python Modules",
+    "Topic :: Text Processing :: Markup :: XML",
 ]
 [project.readme]
@@ -32,4 +32,4 @@ file = "README.md"
 content-type = "text/markdown"
 [project.optional-dependencies]
-tests = [ "pytest", "xmltodict" ]
+tests = [ "pytest", "requests" ]

{xmlpydict-0.0.7 → xmlpydict-0.0.11}/setup.py RENAMED

@@ -16,8 +16,8 @@ class build_ext(build_ext_orig):
 setup(
     include_package_data=True,
     ext_modules=[
-        Extension("xmlpydict", ["src/xmlparse.cpp"]),
+        Extension("pyxmlhandler", ["src/xmlparse.cpp"]),
     ],
     cmdclass={"build_ext": build_ext},
-    package_data={"xmlpydict": ["py.typed"], "": ["xmlpydict.pyi"]},
+    packages=["xmlpydict"],
 )

xmlpydict-0.0.11/src/xmlparse.cpp ADDED

@@ -0,0 +1,222 @@
+/**
+ * Copyright (c) 2023 Matthew Andre Taylor
+ */
+#include <Python.h>
+#include <stdio.h>
+#include <cctype>
+#include <vector>
+static PyObject* strip(PyObject* s_obj) {
+    Py_ssize_t start = 0;
+    Py_ssize_t end = PyUnicode_GetLength(s_obj);
+    while (start < end && std::isspace(PyUnicode_ReadChar(s_obj, start))) {
+      ++start;
+    }
+    while (end > start && std::isspace(PyUnicode_ReadChar(s_obj, end - 1))) {
+      --end;
+    }
+    return PyUnicode_Substring(s_obj, start, end);
+}
+typedef struct {
+    PyObject_HEAD PyObject* item;          // current dict
+    PyObject* data;        // character data buffer
+    std::vector<PyObject*> item_stack;
+    std::vector<PyObject*> data_stack;
+    PyObject* attr_prefix;
+    PyObject* cdata_key;
+} PyDictHandler;
+static PyObject* PyDictHandler_new(PyTypeObject* type, PyObject* args,
+                            PyObject* kwargs) {
+    PyDictHandler* self;
+    self = (PyDictHandler*)type->tp_alloc(type, 0);
+    return (PyObject*)self;
+}
+static int PyDictHandler_init(PyDictHandler* self, PyObject* args,
+                          PyObject* kwargs) {
+    const char* attr_prefix = "@";
+    const char* cdata_key = "#text";
+    static char* kwlist[] = {"attr_prefix", "cdata_key", NULL};
+    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "|ss", kwlist,
+                                     &attr_prefix, &cdata_key))
+        return -1;
+    self->item = Py_None;
+    self->data = PyUnicode_New(0, 127); // empty string
+    self->attr_prefix = PyUnicode_FromString(attr_prefix);
+    self->cdata_key = PyUnicode_FromString(cdata_key);
+    return 0;
+}
+static PyObject* characters(PyDictHandler* self, PyObject* data_obj) {
+    PyUnicode_Append(&self->data, data_obj);
+    Py_RETURN_NONE;
+}
+static PyObject* startElement(PyDictHandler* self, PyObject* args) {
+    self->item_stack.push_back(self->item);
+    self->data_stack.push_back(self->data);
+    self->data = PyUnicode_New(0, 127); // reset data buffer
+    const char* name;
+    PyObject* attrs;
+    if (!PyArg_ParseTuple(args, "sO", &name, &attrs)) {
+        return NULL;
+    }
+    if (!PyDict_Check(attrs) || PyDict_Size(attrs) == 0) {
+        self->item = Py_None;
+        Py_RETURN_NONE;
+    }
+    PyObject* newDict = PyDict_New();
+    PyObject *key, *value;
+    Py_ssize_t pos = 0;
+    while (PyDict_Next(attrs, &pos, &key, &value)) {
+        PyObject* prefixed_key = PyUnicode_Concat(self->attr_prefix, key);
+        PyDict_SetItem(newDict, prefixed_key, value);
+    }
+    self->item = newDict;
+    Py_RETURN_NONE;
+}
+static PyObject* updateChildren(PyObject*& target, PyObject* key, PyObject* value) {
+    if (target == Py_None) {
+        target = PyDict_New();
+    }
+    if (!PyDict_Contains(target, key)) {
+        PyDict_SetItem(target, key, value);
+    } else {
+        PyObject* existing = PyDict_GetItem(target, key);
+        if (PyList_Check(existing)) {
+            PyList_Append(existing, value);
+        } else {
+            PyObject* newList = PyList_New(2);
+            PyList_SetItem(newList, 0, existing);
+            PyList_SetItem(newList, 1, value);
+            PyDict_SetItem(target, key, newList);
+        }
+    }
+    return target;
+}
+static PyObject* endElement(PyDictHandler* self, PyObject* name_obj) {
+    if (!self->data_stack.empty()) {
+        PyObject* temp_data = strip(self->data);
+        bool has_data = (PyUnicode_GetLength(temp_data) > 0);
+        PyObject* py_data = has_data ? temp_data : Py_None;
+        PyObject* temp_item = self->item;
+        self->item = self->item_stack.back();
+        self->data = self->data_stack.back();
+        self->item_stack.pop_back();
+        self->data_stack.pop_back();
+        if (temp_item != Py_None) {
+            if (has_data) {
+                PyDict_SetItem(temp_item, self->cdata_key, py_data);
+            }
+            temp_item = PyDict_Copy(temp_item);
+            self->item = updateChildren(self->item, name_obj, temp_item);
+        }
+        else {
+            self->item = updateChildren(self->item, name_obj, py_data);
+        }
+    }
+    Py_RETURN_NONE;
+}
+static PyMethodDef PyDictHandler_methods[] = {
+    {"characters", (PyCFunction)characters, METH_O, "Handle character data"},
+    {"startElement", (PyCFunction)startElement, METH_VARARGS, "Handle start of an element"},
+    {"endElement", (PyCFunction)endElement, METH_O, "Handle end of an element"},
+    {NULL, NULL, 0, NULL}
+};
+static PyObject* PyDictHandler_get_item(PyDictHandler *self, void *closure)
+{
+    Py_INCREF(self->item);
+    return self->item;
+}
+static PyGetSetDef PyDictHandler_getset[] = {
+    {
+        "item",                                   /* name */
+        (getter)PyDictHandler_get_item,           /* get */
+        NULL,           /* set */
+        NULL,                    /* doc */
+        NULL                                      /* closure */
+    },
+    {NULL}  /* Sentinel */
+};
+static PyTypeObject PyDictHandlerType = {
+    PyVarObject_HEAD_INIT(NULL, 0) "pyxmlhandler._PyDictHandler", // tp_name
+    sizeof(PyDictHandler),                                    // tp_basicsize
+    0,                                                        // tp_itemsize
+    0,                                                        // tp_dealloc
+    0,                                                        // tp_vectorcall_offset
+    0,                                                        // tp_getattr
+    0,                                                        // tp_setattr
+    0,                                                        // tp_as_async
+    0,                                                        // tp_repr
+    0,                                                        // tp_as_number
+    0,                                                        // tp_as_sequence
+    0,                                                        // tp_as_mapping
+    0,                                                        // tp_hash
+    0,                                                        // tp_call
+    0,                                                        // tp_str
+    0,                                                        // tp_getattro
+    0,                                                        // tp_setattro
+    0,                                                        // tp_as_buffer
+    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,                // tp_flags
+    "Handler that converts XML to Python dict",               // tp_doc
+    0,                                                        // tp_traverse
+    0,                                                        // tp_clear
+    0,                                                        // tp_richcompare
+    0,                                                        // tp_weaklistoffset
+    0,                                                        // tp_iter
+    0,                                                        // tp_iternext
+    PyDictHandler_methods,                                    // tp_methods
+    0,                                                        // tp_members
+    PyDictHandler_getset,                                     // tp_getset
+    0,                                                        // tp_base
+    0,                                                        // tp_dict
+    0,                                                        // tp_descr_get
+    0,                                                        // tp_descr_set
+    0,                                                        // tp_dictoffset
+    (initproc)PyDictHandler_init,                             // tp_init
+    0,                                                        // tp_alloc
+    PyDictHandler_new,                                        // tp_new
+};
+static PyModuleDef pyxmlhandlermodule = {
+    PyModuleDef_HEAD_INIT,
+    "pyxmlhandler",
+    "Module that provides XML to Python dict parsing",
+    -1,
+    NULL, NULL, NULL, NULL, NULL
+};
+PyMODINIT_FUNC PyInit_pyxmlhandler(void) {
+    PyObject* m;
+    if (PyType_Ready(&PyDictHandlerType) < 0)
+        return NULL;
+    m = PyModule_Create(&pyxmlhandlermodule);
+    if (m == NULL)
+        return NULL;
+    Py_INCREF(&PyDictHandlerType);
+    PyModule_AddObject(m, "_PyDictHandler", (PyObject*)&PyDictHandlerType);
+    return m;
+}
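
The new `xmlparse.cpp` keeps two stacks (one for the partially built item, one for buffered character data) and folds repeated child names into lists. A pure-Python sketch of that logic may make the control flow easier to follow. It is an approximation written for this note, not the package's code: `DictHandlerSketch`, `update_children`, and `parse_sketch` are hypothetical names, reference counting is not modeled, and `str.strip()` trims a slightly wider set of whitespace than `std::isspace`.

```python
from xml.parsers import expat


def update_children(target, key, value):
    """Attach value under key; a repeated key turns the entry into a list."""
    if target is None:
        target = {}
    if key not in target:
        target[key] = value
    elif isinstance(target[key], list):
        target[key].append(value)
    else:
        target[key] = [target[key], value]
    return target


class DictHandlerSketch:
    """Approximate Python rendering of the C++ handler's state machine."""

    def __init__(self, attr_prefix="@", cdata_key="#text"):
        self.item = None        # item being built for the current element
        self.data = ""          # character data buffered for the current element
        self.item_stack = []    # parent items saved on start_element
        self.data_stack = []    # parent text buffers saved on start_element
        self.attr_prefix = attr_prefix
        self.cdata_key = cdata_key

    def characters(self, data):
        self.data += data

    def start_element(self, name, attrs):
        # Save the parent's state, then start fresh for the child.
        self.item_stack.append(self.item)
        self.data_stack.append(self.data)
        self.data = ""
        # An element without attributes starts as None, not as an empty dict.
        self.item = (
            {self.attr_prefix + k: v for k, v in attrs.items()} if attrs else None
        )

    def end_element(self, name):
        text = self.data.strip() or None
        item = self.item
        # Restore the parent's state before attaching the finished child.
        self.item = self.item_stack.pop()
        self.data = self.data_stack.pop()
        if item is not None:
            if text is not None:
                item[self.cdata_key] = text
            value = item
        else:
            value = text
        self.item = update_children(self.item, name, value)


def parse_sketch(xml_content, attr_prefix="@", cdata_key="#text"):
    handler = DictHandlerSketch(attr_prefix, cdata_key)
    parser = expat.ParserCreate()
    parser.CharacterDataHandler = handler.characters
    parser.StartElementHandler = handler.start_element
    parser.EndElementHandler = handler.end_element
    parser.Parse(xml_content, True)
    return handler.item


print(parse_sketch("<p>Hey <b>bold</b>There <b>bold</b>Buddy </p>"))
```

Because the parent's text buffer is restored after each child closes, text on either side of a child element is concatenated into a single `#text` value, which matches the expectations in `tests/test_parse.py` below.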

{xmlpydict-0.0.7 → xmlpydict-0.0.11}/tests/test_parse.py RENAMED

@@ -1,13 +1,14 @@
 import pytest
-from xmlpydict import parse
 import json
+from xmlpydict import parse
 def test_simple():
-    assert parse("") == {}
-    assert parse("<p/>") == {"p": {}}
-    assert parse("<p></p>") == {"p": {}}
+    assert parse("<p/>") == {"p": None}
+    assert parse("<p></p>") == {"p": None}
     assert parse('<p width="10"></p>') == {"p": {"@width": "10"}}
+    assert parse("<p>Hello</p>") == {"p": "Hello"}
     assert parse('<p width="10">Hello World</p>') == {
         "p": {"@width": "10", "#text": "Hello World"}
     }
@@ -21,7 +22,18 @@ def test_simple():
         "p": {"@width": "10", "@height": "20"}
     }
     assert parse("<p>Hey <b>bold</b>There</p>") == {
-        "p": {"#text": "HeyThere", "b": {"#text": "bold"}}
+        "p": {"#text": "Hey There", "b": "bold"}
+    }
+    assert parse("<p>Hey <b>bold</b>There <b>bold</b>Buddy </p>") == {
+        "p": {"#text": "Hey There Buddy", "b": ["bold", "bold"]}
+    }
+    assert parse("<p>Hey <b/>There Buddy</p>") == {
+        "p": {"#text": "Hey There Buddy", "b": None}
+    }
+    assert parse("<p>Hey <b/>There Buddy <b/> </p>") == {
+        "p": {"#text": "Hey There Buddy", "b": [None, None]}
     }
     assert (
@@ -66,15 +78,18 @@ def test_simple():
     )
-def test_nested():
-    assert parse("<book><p/></book> ") == {"book": {"p": {}}}
-    assert parse("<book><p></p></book>") == {"book": {"p": {}}}
-    assert parse("<book><p></p></book><card/>") == {"book": {"p": {}}, "card": {}}
-    assert parse("<pizza></pizza><book><p></p></book><card/>") == {
-        "pizza": {},
-        "book": {"p": {}},
-        "card": {},
+def test_cdata():
+    assert parse("<content><![CDATA[<p>This is a paragraph</p>]]></content>") == {
+        "content": "<p>This is a paragraph</p>"
     }
+    assert parse(
+        "<special_chars><![CDATA[$ ^ * % & <> () + - + ` ~]]></special_chars>"
+    ) == {"special_chars": "$ ^ * % & <> () + - + ` ~"}
+def test_nested():
+    assert parse("<book><p/></book> ") == {"book": {"p": None}}
+    assert parse("<book><p></p></book>") == {"book": {"p": None}}
 def test_list():
@@ -89,12 +104,20 @@ def test_list():
 def test_comment():
-    assert parse("<!-- simple comment -->") == {}
+    assert parse("<p/><!-- simple comment -->") == {"p": None}
     comment = """<world>
   <!-- $comment+++@python -->
   <lake>Content</lake>
 </world>"""
-    assert parse(comment) == {"world": {"lake": {"#text": "Content"}}}
+    assert parse(comment) == {"world": {"lake": "Content"}}
+    multiple_comments = """<book>
+    <!-- Comment 0 -->
+    <!-- Comment 1 -->
+    <lines>510</lines>
+    <!-- Comment 2 -->
+    <!-- -->
+</book>"""
+    assert parse(multiple_comments) == {"book": {"lines": "510"}}
 def test_files():
@@ -269,16 +292,61 @@ def test_files():
 def test_exception():
     xml_strings = [
-        "< p/>",
-        "<p>",
-        "<p/ >",
         "<p height'10'/>",
         "<p height='10'width='5'/>",
-        "<p width='5/>",
         "<p width=5'/>",
-        "</p>",
         "<pwidth='5'/>",
+        "<!---->",
+        "<a></p>",
+        "<></>",
+        "</>",
+        "<",
+        ">",
+        "<nested></p></nested>",
     ]
     for xml_str in xml_strings:
         with pytest.raises(Exception):
             parse(xml_str)
+def test_prefix():
+    assert parse("<p></p>", attr_prefix="$") == {"p": None}
+    assert parse('<p width="10"></p>', attr_prefix="$") == {"p": {"$width": "10"}}
+    assert parse('<p width="10" height="5"></p>', attr_prefix="$") == {
+        "p": {"$width": "10", "$height": "5"}
+    }
+    assert parse('<p width="10" height="5"></p>', attr_prefix="$$$$$$$$$") == {
+        "p": {"$$$$$$$$$width": "10", "$$$$$$$$$height": "5"}
+    }
+    assert parse('<p width="10" height="5"></p>', attr_prefix="") == {
+        "p": {"width": "10", "height": "5"}
+    }
+def test_document():
+    s = """<?xml version="1.0" encoding="UTF-8"?><repository>
+  <project pypi="xmlpydict">
+    <title>XML document parser</title>
+    <author>Matthew Taylor</author>
+  </project>
+  <project pypi="blank">
+    <title>Test project</title>
+    <author>Matthew Taylor</author>
+  </project>
+</repository>"""
+    assert parse(s) == {
+        "repository": {
+            "project": [
+                {
+                    "@pypi": "xmlpydict",
+                    "title": "XML document parser",
+                    "author": "Matthew Taylor",
+                },
+                {
+                    "@pypi": "blank",
+                    "title": "Test project",
+                    "author": "Matthew Taylor",
+                },
+            ]
+        }
+    }
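
Several assertions were dropped from the tests in this diff (the empty string, and documents with more than one top-level element). That is plausibly a consequence of moving to expat, which rejects both. The sketch below checks the inputs from the updated `test_exception` against the standard-library parser directly; `rejected_by_expat` is a hypothetical helper, and the code assumes that `xmlpydict.parse` raises whenever expat does.

```python
from xml.parsers import expat

# Inputs copied from the updated test_exception in this diff.
MALFORMED = [
    "<p height'10'/>",
    "<p height='10'width='5'/>",
    "<p width=5'/>",
    "<pwidth='5'/>",
    "<!---->",
    "<a></p>",
    "<></>",
    "</>",
    "<",
    ">",
    "<nested></p></nested>",
]


def rejected_by_expat(xml_content):
    """Return True if stdlib expat raises ExpatError for this document."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(xml_content, True)
    except expat.ExpatError:
        return True
    return False


for xml_str in MALFORMED:
    print(repr(xml_str), rejected_by_expat(xml_str))
```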

xmlpydict-0.0.11/xmlpydict/__init__.py ADDED

@@ -0,0 +1,45 @@
+from pyxmlhandler import _PyDictHandler
+from xml.parsers import expat
+def parse(xml_content, attr_prefix: str = "@", cdata_key: str = "#text") -> dict:
+    """
+    Parse XML content into a python dictionary.
+    Args:
+        xml_content: The XML content to be parsed.
+        attr_prefix: The prefix to use for attributes in the resulting dictionary.
+        cdata_key: The key to use for character data in the resulting dictionary.
+    Returns:
+        A dictionary representation of the XML content.
+    """
+    handler = _PyDictHandler(attr_prefix=attr_prefix, cdata_key=cdata_key)
+    parser = expat.ParserCreate()
+    parser.CharacterDataHandler = handler.characters
+    parser.StartElementHandler = handler.startElement
+    parser.EndElementHandler = handler.endElement
+    parser.Parse(xml_content, True)
+    return handler.item
+def parse_file(file_path, attr_prefix: str = "@", cdata_key: str = "#text") -> dict:
+    """
+    Parse an XML file into a python dictionary.
+    Args:
+        file_path: The path to the XML file to be parsed.
+        attr_prefix: The prefix to use for attributes in the resulting dictionary.
+        cdata_key: The key to use for character data in the resulting dictionary.
+    Returns:
+        A dictionary representation of the XML file content.
+    """
+    handler = _PyDictHandler(attr_prefix=attr_prefix, cdata_key=cdata_key)
+    parser = expat.ParserCreate()
+    parser.CharacterDataHandler = handler.characters
+    parser.StartElementHandler = handler.startElement
+    parser.EndElementHandler = handler.endElement
+    with open(file_path, "r", encoding="utf-8") as f:
+        parser.ParseFile(f)
+    return handler.item
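
One point worth checking in `parse_file`: as far as I know, `ParseFile` in CPython 3 expects the file object's `read()` to return bytes, so a file opened in text mode may raise `TypeError`. Opening in binary mode also lets expat apply the document's own encoding declaration. The sketch below illustrates the binary-mode variant with a trivial handler; `element_names_from_file` is a hypothetical function that collects element names rather than building a dictionary, and it is not part of xmlpydict.

```python
from xml.parsers import expat


def element_names_from_file(file_path):
    """Return element names in document order, reading the file as bytes."""
    names = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # Binary mode: expat decodes the bytes itself, using the XML declaration.
    with open(file_path, "rb") as f:
        parser.ParseFile(f)
    return names
```

The same `open(file_path, "rb")` change would slot into a dictionary-building handler unchanged, since only the way the file is opened differs.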
