ljavalang 2.0.1__tar.gz

Files changed (37)
  1. ljavalang-2.0.1/LICENSE.txt +19 -0
  2. ljavalang-2.0.1/PKG-INFO +279 -0
  3. ljavalang-2.0.1/README.md +252 -0
  4. ljavalang-2.0.1/README.rst +158 -0
  5. ljavalang-2.0.1/javalang/__init__.py +8 -0
  6. ljavalang-2.0.1/javalang/ast.py +86 -0
  7. ljavalang-2.0.1/javalang/javadoc.py +120 -0
  8. ljavalang-2.0.1/javalang/parse.py +53 -0
  9. ljavalang-2.0.1/javalang/parser.py +2884 -0
  10. ljavalang-2.0.1/javalang/test/__init__.py +0 -0
  11. ljavalang-2.0.1/javalang/test/source/package-info/AnnotationJavadoc.java +5 -0
  12. ljavalang-2.0.1/javalang/test/source/package-info/AnnotationOnly.java +2 -0
  13. ljavalang-2.0.1/javalang/test/source/package-info/JavadocAnnotation.java +5 -0
  14. ljavalang-2.0.1/javalang/test/source/package-info/JavadocOnly.java +4 -0
  15. ljavalang-2.0.1/javalang/test/source/package-info/NoAnnotationNoJavadoc.java +1 -0
  16. ljavalang-2.0.1/javalang/test/test_java_10_11_syntax.py +66 -0
  17. ljavalang-2.0.1/javalang/test/test_java_14_15_syntax.py +195 -0
  18. ljavalang-2.0.1/javalang/test/test_java_16_17_syntax.py +99 -0
  19. ljavalang-2.0.1/javalang/test/test_java_21_syntax.py +140 -0
  20. ljavalang-2.0.1/javalang/test/test_java_8_syntax.py +241 -0
  21. ljavalang-2.0.1/javalang/test/test_java_9_syntax.py +115 -0
  22. ljavalang-2.0.1/javalang/test/test_javadoc.py +14 -0
  23. ljavalang-2.0.1/javalang/test/test_package_declaration.py +61 -0
  24. ljavalang-2.0.1/javalang/test/test_tokenizer.py +192 -0
  25. ljavalang-2.0.1/javalang/test/test_upstream_features.py +120 -0
  26. ljavalang-2.0.1/javalang/test/test_upstream_issues.py +176 -0
  27. ljavalang-2.0.1/javalang/test/test_util.py +69 -0
  28. ljavalang-2.0.1/javalang/tokenizer.py +687 -0
  29. ljavalang-2.0.1/javalang/tree.py +340 -0
  30. ljavalang-2.0.1/javalang/util.py +165 -0
  31. ljavalang-2.0.1/javalang/visitor.py +48 -0
  32. ljavalang-2.0.1/ljavalang.egg-info/PKG-INFO +279 -0
  33. ljavalang-2.0.1/ljavalang.egg-info/SOURCES.txt +35 -0
  34. ljavalang-2.0.1/ljavalang.egg-info/dependency_links.txt +1 -0
  35. ljavalang-2.0.1/ljavalang.egg-info/top_level.txt +1 -0
  36. ljavalang-2.0.1/pyproject.toml +49 -0
  37. ljavalang-2.0.1/setup.cfg +4 -0
@@ -0,0 +1,19 @@
1
+ Copyright (c) 2013 Christopher Thunes
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy
4
+ of this software and associated documentation files (the "Software"), to deal
5
+ in the Software without restriction, including without limitation the rights
6
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
7
+ copies of the Software, and to permit persons to whom the Software is
8
+ furnished to do so, subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in
11
+ all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
15
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
16
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
17
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
18
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
19
+ THE SOFTWARE.
@@ -0,0 +1,279 @@
1
+ Metadata-Version: 2.1
2
+ Name: ljavalang
3
+ Version: 2.0.1
4
+ Summary: Enhanced Java parser with Java 9-22 support, fork of javalang
5
+ Author-email: LoRexxar <lorexxar@gmail.com>
6
+ Maintainer-email: LoRexxar <lorexxar@gmail.com>
7
+ License: MIT
8
+ Project-URL: Homepage, https://github.com/LoRexxar/Ljavalang
9
+ Project-URL: Repository, https://github.com/LoRexxar/Ljavalang
10
+ Project-URL: Issues, https://github.com/LoRexxar/Ljavalang/issues
11
+ Keywords: java,parser,ast,static-analysis,javalang
12
+ Classifier: Development Status :: 5 - Production/Stable
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: License :: OSI Approved :: MIT License
15
+ Classifier: Operating System :: OS Independent
16
+ Classifier: Programming Language :: Python :: 3
17
+ Classifier: Programming Language :: Python :: 3.9
18
+ Classifier: Programming Language :: Python :: 3.10
19
+ Classifier: Programming Language :: Python :: 3.11
20
+ Classifier: Programming Language :: Python :: 3.12
21
+ Classifier: Topic :: Software Development :: Libraries
22
+ Classifier: Topic :: Software Development :: Quality Assurance
23
+ Classifier: Topic :: Software Development :: Compilers
24
+ Requires-Python: >=3.9
25
+ Description-Content-Type: text/markdown
26
+ License-File: LICENSE.txt
27
+
28
+ # Ljavalang
29
+
30
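+ > An enhanced fork of [javalang](https://github.com/c2nes/javalang) that fixes upstream AST construction defects and adds support for new Java 9-22 syntax, providing accurate Java syntax trees for static analysis tools such as [Kunlun-M](https://github.com/LoRexxar/Kunlun-M).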
31
+
32
+ [![GitHub Actions](https://github.com/LoRexxar/Ljavalang/actions/workflows/tests.yml/badge.svg?branch=develop)](https://github.com/LoRexxar/Ljavalang/actions/workflows/tests.yml)
33
+
34
+ ## Differences from upstream
35
+
36
+ | Feature | Upstream javalang | Ljavalang |
37
+ |------|:---:|:---:|
38
+ | Java 8 syntax | ✅ | ✅ |
39
+ | **Chained-call fix** | ❌ `a.b().c()` parsed into flat selectors | ✅ correctly nested as a qualifier chain |
40
+ | **Java 9** try-with-resources on effectively final variables | ❌ | ✅ |
41
+ | **Java 9** module-info | ❌ | ✅ |
42
+ | **Java 10** `var` type inference | ❌ | ✅ |
43
+ | **Java 14** switch expression (arrow/yield) | ❌ | ✅ |
44
+ | **Java 14** pattern matching instanceof | ❌ | ✅ |
45
+ | **Java 15** text block (triple-quoted string) | ❌ | ✅ |
46
+ | **Java 16** record class | ❌ | ✅ |
47
+ | **Java 17** sealed / permits / non-sealed | ❌ | ✅ |
48
+ | **Java 21** pattern matching switch | ❌ | ✅ |
49
+ | **Java 21** record pattern (destructuring) | ❌ | ✅ |
50
+ | **Java 22** unnamed variable `_` | ❌ | ✅ |
51
+ | **Upstream issue fixes** | Some unfixed | ✅ all 151 issues analyzed, 32 bugs verified |
52
+ | **Token position range** | ❌ | ✅ `Position.range` |
53
+ | **Visitor pattern** | ❌ | ✅ `javalang.visitor.JavaVisitor` |
54
+ | **Receiver parameter** | ❌ | ✅ Java 8 `Type.this` parameter |
55
+
56
+ ## Installation
57
+
58
+ ```bash
59
+ pip install git+https://github.com/LoRexxar/Ljavalang.git@develop
60
+ ```
61
+
62
+ Or clone the repository and install it locally:
63
+
64
+ ```bash
65
+ git clone https://github.com/LoRexxar/Ljavalang.git
66
+ cd Ljavalang
67
+ pip install -e .
68
+ ```
69
+
70
+ ## Quick start
71
+
72
+ Usage is fully compatible with upstream javalang:
73
+
74
+ ```python
75
+ >>> import javalang
76
+ >>> tree = javalang.parse.parse('package com.example; class Test {}')
77
+ >>> tree.package.name
78
+ 'com.example'
79
+ >>> tree.types[0].name
80
+ 'Test'
81
+ ```
82
+
83
+ ### New syntax examples
84
+
85
+ **Java 14 switch expression:**
86
+ ```python
87
+ >>> code = '''
88
+ ... class T {
89
+ ... int m(int x) {
90
+ ... return switch(x) {
91
+ ... case 1 -> 10;
92
+ ... case 2 -> 20;
93
+ ... default -> 0;
94
+ ... };
95
+ ... }
96
+ ... }'''
97
+ >>> tree = javalang.parse.parse(code)
98
+ >>> # The expression in the return statement is a SwitchExpression
99
+ >>> tree.types[0].body[0].body[0].expression
100
+ SwitchExpression
101
+ ```
102
+
103
+ **Java 16 record:**
104
+ ```python
105
+ >>> tree = javalang.parse.parse('record Point(int x, int y) {}')
106
+ >>> tree.types[0]
107
+ RecordDeclaration
108
+ >>> tree.types[0].name
109
+ 'Point'
110
+ ```
111
+
112
+ **Java 21 record pattern:**
113
+ ```python
114
+ >>> code = '''
115
+ ... class T {
116
+ ... record Point(int x, int y) {}
117
+ ... void m(Object o) {
118
+ ... switch(o) {
119
+ ... case Point(int x, int y) -> System.out.println(x + y);
120
+ ... default -> {}
121
+ ... }
122
+ ... }
123
+ ... }'''
124
+ >>> javalang.parse.parse(code) # parses without error
125
+ ```
126
+
127
+ **Chained calls (core bug fix):**
128
+ ```python
129
+ >>> code = 'class T { void m(String cmd) { Runtime.getRuntime().exec(cmd); } }'
130
+ >>> tree = javalang.parse.parse(code)
131
+ >>> # Upstream incorrectly puts exec into the selectors list
132
+ >>> # Ljavalang correctly parses it as a nested MethodInvocation qualifier chain
133
+ ```
134
+
135
+ ### Traversal with the Visitor pattern
136
+
137
+ ```python
138
+ from javalang.visitor import JavaVisitor
139
+
140
+ class MethodCollector(JavaVisitor):
141
+     def __init__(self):
142
+         self.methods = []
143
+
144
+     def visit_MethodDeclaration(self, node):
145
+         self.methods.append(node.name)
146
+         self.generic_visit(node)
147
+
148
+ collector = MethodCollector()
149
+ collector.visit(tree)
150
+ print(collector.methods) # ['foo', 'bar', ...]
151
+ ```
152
+
153
+ ### Token position ranges
154
+
155
+ ```python
156
+ from javalang.tokenizer import tokenize
157
+
158
+ code = 'int x = 42;'
159
+ for token in tokenize(code):
160
+     r = token.position.range
161
+     print(f'{token.value} -> code[{r.start}:{r.stop}] = {code[r]!r}')
162
+ # int -> code[0:3] = 'int'
163
+ # x -> code[4:5] = 'x'
164
+ # = -> code[6:7] = '='
165
+ # 42 -> code[8:10] = '42'
166
+ ```
167
+
168
+ ## Testing
169
+
170
+ ```bash
171
+ # Run the full test suite (112 test cases)
172
+ python -m pytest javalang/test/ -v \
173
+ --ignore=javalang/test/test_java_8_syntax.py \
174
+ --ignore=javalang/test/test_package_declaration.py
175
+
176
+ # Run only the tests for a specific Java version
177
+ python -m pytest javalang/test/test_java_21_syntax.py -v
178
+
179
+ # Run only the upstream issue regression tests
180
+ python -m pytest javalang/test/test_upstream_issues.py javalang/test/test_upstream_features.py -v
181
+ ```
182
+
183
+ Test matrix: Python 3.9 / 3.10 / 3.11 / 3.12, run automatically via GitHub Actions.
184
+
185
+ ## Supported Java syntax features
186
+
187
+ <details>
188
+ <summary>Full list (click to expand)</summary>
189
+
190
+ ### Java 8 (already supported upstream)
191
+ - Lambda expressions
192
+ - Method references
193
+ - Type annotations
194
+ - Interface default/static methods
195
+ - General try-with-resources
196
+ - Receiver parameter (`Inner.this` parameter)
197
+
198
+ ### Java 9
199
+ - `try`-with-resources with effectively final variables
200
+ - `module-info.java` (module / open module / requires / exports / opens / uses / provides)
201
+ - Private interface methods
202
+ - Diamond operator with anonymous classes
203
+
204
+ ### Java 10-11
205
+ - `var` local variable type inference
206
+ - `var` in for-each / try-with-resources
207
+ - `var` in lambda parameters
208
+
209
+ ### Java 14
210
+ - Switch expression (`case X ->` arrow syntax)
211
+ - Switch used as an expression (`return switch(...)` / right-hand side of an assignment)
212
+ - Multi-label case (`case 1, 2, 3 ->`)
213
+ - `yield` statement
214
+ - Pattern matching `instanceof` (`obj instanceof String s`)
215
+
216
+ ### Java 15
217
+ - Text block (`"""..."""` triple-quoted string)
218
+
219
+ ### Java 16
220
+ - `record` class declarations
221
+ - Local record / enum (inside a method body)
222
+ - Records as class members
223
+
224
+ ### Java 17
225
+ - `sealed` class / interface
226
+ - `permits` clause
227
+ - `non-sealed` modifier
228
+
229
+ ### Java 21
230
+ - Pattern matching switch (`case String s ->`)
231
+ - Record pattern destructuring (`case Point(int x, int y) ->`)
232
+ - Nested record patterns
233
+ - `case null` matching
234
+
235
+ ### Java 22
236
+ - Unnamed variable `_`
237
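+ - Unnamed lambda parameters
238
+
239
+ ### Upstream bug fixes (32 items)
240
+ - **Chained calls**: `a.b().c()` is no longer wrongly placed in `selectors`; it is correctly nested as a qualifier chain
241
+ - **DecimalInteger inheritance**: inherits from `Integer` instead of skipping a level to `Literal`
242
+ - **Character token**: the char literal `'a'` produces a `Character` token instead of `String`
243
+ - **Annotations inside generics**: `List<@NotNull String>` parses correctly
244
+ - **void return type**: `return_type` is `'void'` instead of `None`
245
+ - **prefix/postfix preserved**: unary operators inside parentheses are no longer lost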
246
+
247
+ </details>
248
+
249
+ ## Project structure
250
+
251
+ ```
252
+ javalang/
253
+ ├── parse.py              # Entry points: parse() / parse_expression() etc.
254
+ ├── parser.py             # Recursive-descent parser (~2800 lines)
255
+ ├── tokenizer.py          # Lexer (~700 lines)
256
+ ├── tree.py               # AST node definitions (~340 lines)
257
+ ├── visitor.py            # Visitor-pattern traversal
258
+ ├── test/                 # Test cases (112)
259
+ │   ├── test_java_9_syntax.py
260
+ │   ├── test_java_10_11_syntax.py
261
+ │   ├── test_java_14_15_syntax.py
262
+ │   ├── test_java_16_17_syntax.py
263
+ │   ├── test_java_21_syntax.py
264
+ │   ├── test_upstream_issues.py    # Upstream bug regression tests
265
+ │   └── test_upstream_features.py  # Upstream feature tests
266
+ └── docs/
267
+     ├── architecture.md            # Architecture notes
268
+     ├── java-version-roadmap.md    # Java version support roadmap
269
+     ├── upstream-issues.md         # Classification of the 151 upstream issues
270
+     └── issue-fix-progress.md      # Fix progress tracking
271
+ ```
272
+
273
+ ## Acknowledgements
274
+
275
+ Built on [c2nes/javalang](https://github.com/c2nes/javalang) by Chris Thunes, and developed to provide Java parsing support for the [Kunlun-M](https://github.com/LoRexxar/Kunlun-M) white-box scanner.
276
+
277
+ ## License
278
+
279
+ MIT License (inherited from upstream)
@@ -0,0 +1,252 @@
1
+ # Ljavalang
2
+
3
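+ > An enhanced fork of [javalang](https://github.com/c2nes/javalang) that fixes upstream AST construction defects and adds support for new Java 9-22 syntax, providing accurate Java syntax trees for static analysis tools such as [Kunlun-M](https://github.com/LoRexxar/Kunlun-M).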
4
+
5
+ [![GitHub Actions](https://github.com/LoRexxar/Ljavalang/actions/workflows/tests.yml/badge.svg?branch=develop)](https://github.com/LoRexxar/Ljavalang/actions/workflows/tests.yml)
6
+
7
+ ## Differences from upstream
8
+
9
+ | Feature | Upstream javalang | Ljavalang |
10
+ |------|:---:|:---:|
11
+ | Java 8 syntax | ✅ | ✅ |
12
+ | **Chained-call fix** | ❌ `a.b().c()` parsed into flat selectors | ✅ correctly nested as a qualifier chain |
13
+ | **Java 9** try-with-resources on effectively final variables | ❌ | ✅ |
14
+ | **Java 9** module-info | ❌ | ✅ |
15
+ | **Java 10** `var` type inference | ❌ | ✅ |
16
+ | **Java 14** switch expression (arrow/yield) | ❌ | ✅ |
17
+ | **Java 14** pattern matching instanceof | ❌ | ✅ |
18
+ | **Java 15** text block (triple-quoted string) | ❌ | ✅ |
19
+ | **Java 16** record class | ❌ | ✅ |
20
+ | **Java 17** sealed / permits / non-sealed | ❌ | ✅ |
21
+ | **Java 21** pattern matching switch | ❌ | ✅ |
22
+ | **Java 21** record pattern (destructuring) | ❌ | ✅ |
23
+ | **Java 22** unnamed variable `_` | ❌ | ✅ |
24
+ | **Upstream issue fixes** | Some unfixed | ✅ all 151 issues analyzed, 32 bugs verified |
25
+ | **Token position range** | ❌ | ✅ `Position.range` |
26
+ | **Visitor pattern** | ❌ | ✅ `javalang.visitor.JavaVisitor` |
27
+ | **Receiver parameter** | ❌ | ✅ Java 8 `Type.this` parameter |
28
+
29
+ ## Installation
30
+
31
+ ```bash
32
+ pip install git+https://github.com/LoRexxar/Ljavalang.git@develop
33
+ ```
34
+
35
+ Or clone the repository and install it locally:
36
+
37
+ ```bash
38
+ git clone https://github.com/LoRexxar/Ljavalang.git
39
+ cd Ljavalang
40
+ pip install -e .
41
+ ```
42
+
43
+ ## Quick start
44
+
45
+ Usage is fully compatible with upstream javalang:
46
+
47
+ ```python
48
+ >>> import javalang
49
+ >>> tree = javalang.parse.parse('package com.example; class Test {}')
50
+ >>> tree.package.name
51
+ 'com.example'
52
+ >>> tree.types[0].name
53
+ 'Test'
54
+ ```
55
+
56
+ ### New syntax examples
57
+
58
+ **Java 14 switch expression:**
59
+ ```python
60
+ >>> code = '''
61
+ ... class T {
62
+ ... int m(int x) {
63
+ ... return switch(x) {
64
+ ... case 1 -> 10;
65
+ ... case 2 -> 20;
66
+ ... default -> 0;
67
+ ... };
68
+ ... }
69
+ ... }'''
70
+ >>> tree = javalang.parse.parse(code)
71
+ >>> # The expression in the return statement is a SwitchExpression
72
+ >>> tree.types[0].body[0].body[0].expression
73
+ SwitchExpression
74
+ ```
75
+
76
+ **Java 16 record:**
77
+ ```python
78
+ >>> tree = javalang.parse.parse('record Point(int x, int y) {}')
79
+ >>> tree.types[0]
80
+ RecordDeclaration
81
+ >>> tree.types[0].name
82
+ 'Point'
83
+ ```
84
+
85
+ **Java 21 record pattern:**
86
+ ```python
87
+ >>> code = '''
88
+ ... class T {
89
+ ... record Point(int x, int y) {}
90
+ ... void m(Object o) {
91
+ ... switch(o) {
92
+ ... case Point(int x, int y) -> System.out.println(x + y);
93
+ ... default -> {}
94
+ ... }
95
+ ... }
96
+ ... }'''
97
+ >>> javalang.parse.parse(code) # parses without error
98
+ ```
99
+
100
+ **Chained calls (core bug fix):**
101
+ ```python
102
+ >>> code = 'class T { void m(String cmd) { Runtime.getRuntime().exec(cmd); } }'
103
+ >>> tree = javalang.parse.parse(code)
104
+ >>> # Upstream incorrectly puts exec into the selectors list
105
+ >>> # Ljavalang correctly parses it as a nested MethodInvocation qualifier chain
106
+ ```
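The difference between the two shapes can be sketched with plain dataclasses. `FlatCall`, `NestedCall` and `call_chain` below are hypothetical stand-ins invented for this illustration; they are not javalang or Ljavalang classes, and the real node fields may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class FlatCall:
    """Hypothetical upstream-style shape: later calls trail in `selectors`."""
    qualifier: str
    member: str
    selectors: List[str] = field(default_factory=list)


@dataclass
class NestedCall:
    """Hypothetical nested shape: the qualifier may itself be a call."""
    member: str
    qualifier: Optional[Union[str, "NestedCall"]] = None


def call_chain(call: NestedCall) -> List[str]:
    """Return the names from the outermost receiver to the final call."""
    names = []
    node = call
    while isinstance(node, NestedCall):
        names.append(node.member)
        node = node.qualifier
    if node is not None:
        names.append(node)  # the plain-string receiver, e.g. "Runtime"
    return list(reversed(names))


# Runtime.getRuntime().exec(cmd) modelled in both shapes
flat = FlatCall(qualifier="Runtime", member="getRuntime", selectors=["exec"])
nested = NestedCall(
    member="exec",
    qualifier=NestedCall(member="getRuntime", qualifier="Runtime"),
)

print(flat.member)         # getRuntime: the top node is not the final call
print(nested.member)       # exec: the top node is the call actually made
print(call_chain(nested))  # ['Runtime', 'getRuntime', 'exec']
```

In the nested shape a tool looking for a sink such as `exec` can match on the top-level node and then walk the qualifier to find the receiver, which is presumably why the fix matters for static analysis.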
107
+
108
+ ### Traversal with the Visitor pattern
109
+
110
+ ```python
111
+ from javalang.visitor import JavaVisitor
112
+
113
+ class MethodCollector(JavaVisitor):
114
+     def __init__(self):
115
+         self.methods = []
116
+
117
+     def visit_MethodDeclaration(self, node):
118
+         self.methods.append(node.name)
119
+         self.generic_visit(node)
120
+
121
+ collector = MethodCollector()
122
+ collector.visit(tree)
123
+ print(collector.methods) # ['foo', 'bar', ...]
124
+ ```
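The `visit_<NodeName>` / `generic_visit` naming matches Python's standard `ast.NodeVisitor`. Assuming `JavaVisitor` dispatches the same way (an assumption; this README does not show its internals), the pattern can be tried on Python source using only the standard library:

```python
import ast


class FunctionCollector(ast.NodeVisitor):
    """Collect function names, analogous to MethodCollector above."""

    def __init__(self):
        self.functions = []

    def visit_FunctionDef(self, node):
        self.functions.append(node.name)
        # Without this call, nested definitions would be skipped.
        self.generic_visit(node)


source = (
    "def foo():\n"
    "    def inner():\n"
    "        pass\n"
    "\n"
    "def bar():\n"
    "    pass\n"
)

collector = FunctionCollector()
collector.visit(ast.parse(source))
print(collector.functions)  # ['foo', 'inner', 'bar']
```

Calling `generic_visit` inside the handler is what keeps the traversal descending; dropping it would report only the top-level definitions.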
125
+
126
+ ### Token position ranges
127
+
128
+ ```python
129
+ from javalang.tokenizer import tokenize
130
+
131
+ code = 'int x = 42;'
132
+ for token in tokenize(code):
133
+     r = token.position.range
134
+     print(f'{token.value} -> code[{r.start}:{r.stop}] = {code[r]!r}')
135
+ # int -> code[0:3] = 'int'
136
+ # x -> code[4:5] = 'x'
137
+ # = -> code[6:7] = '='
138
+ # 42 -> code[8:10] = '42'
139
+ ```
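The example above uses `Position.range` like a slice (`code[r]`, `r.start`, `r.stop`). The sketch below shows the same idea with a deliberately tiny regex tokenizer from the standard library; `TOKEN_RE` and `token_ranges` are hypothetical names and this is not Ljavalang's tokenizer.

```python
import re

# Hypothetical mini-tokenizer: a word/number run or one punctuation character.
TOKEN_RE = re.compile(r"\s*(\w+|[^\w\s])")


def token_ranges(code):
    """Yield (text, slice) pairs; each slice indexes back into `code`."""
    pos = 0
    while pos < len(code):
        match = TOKEN_RE.match(code, pos)
        if match is None:  # only trailing whitespace remains
            break
        yield match.group(1), slice(match.start(1), match.end(1))
        pos = match.end()


code = 'int x = 42;'
ranges = list(token_ranges(code))
for text, r in ranges:
    # Slicing the source with the stored range recovers the token text.
    assert code[r] == text

print([(r.start, r.stop) for _, r in ranges])
# [(0, 3), (4, 5), (6, 7), (8, 10), (10, 11)]
```

Storing a slice rather than only a line and column lets a caller highlight or rewrite the exact source span without re-scanning the text.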
140
+
141
+ ## Testing
142
+
143
+ ```bash
144
+ # Run the full test suite (112 test cases)
145
+ python -m pytest javalang/test/ -v \
146
+ --ignore=javalang/test/test_java_8_syntax.py \
147
+ --ignore=javalang/test/test_package_declaration.py
148
+
149
+ # Run only the tests for a specific Java version
150
+ python -m pytest javalang/test/test_java_21_syntax.py -v
151
+
152
+ # Run only the upstream issue regression tests
153
+ python -m pytest javalang/test/test_upstream_issues.py javalang/test/test_upstream_features.py -v
154
+ ```
155
+
156
+ Test matrix: Python 3.9 / 3.10 / 3.11 / 3.12, run automatically via GitHub Actions.
157
+
158
+ ## Supported Java syntax features
159
+
160
+ <details>
161
+ <summary>Full list (click to expand)</summary>
162
+
163
+ ### Java 8 (already supported upstream)
164
+ - Lambda expressions
165
+ - Method references
166
+ - Type annotations
167
+ - Interface default/static methods
168
+ - General try-with-resources
169
+ - Receiver parameter (`Inner.this` parameter)
170
+
171
+ ### Java 9
172
+ - `try`-with-resources with effectively final variables
173
+ - `module-info.java` (module / open module / requires / exports / opens / uses / provides)
174
+ - Private interface methods
175
+ - Diamond operator with anonymous classes
176
+
177
+ ### Java 10-11
178
+ - `var` local variable type inference
179
+ - `var` in for-each / try-with-resources
180
+ - `var` in lambda parameters
181
+
182
+ ### Java 14
183
+ - Switch expression (`case X ->` arrow syntax)
184
+ - Switch used as an expression (`return switch(...)` / right-hand side of an assignment)
185
+ - Multi-label case (`case 1, 2, 3 ->`)
186
+ - `yield` statement
187
+ - Pattern matching `instanceof` (`obj instanceof String s`)
188
+
189
+ ### Java 15
190
+ - Text block (`"""..."""` triple-quoted string)
191
+
192
+ ### Java 16
193
+ - `record` class declarations
194
+ - Local record / enum (inside a method body)
195
+ - Records as class members
196
+
197
+ ### Java 17
198
+ - `sealed` class / interface
199
+ - `permits` clause
200
+ - `non-sealed` modifier
201
+
202
+ ### Java 21
203
+ - Pattern matching switch (`case String s ->`)
204
+ - Record pattern destructuring (`case Point(int x, int y) ->`)
205
+ - Nested record patterns
206
+ - `case null` matching
207
+
208
+ ### Java 22
209
+ - Unnamed variable `_`
210
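+ - Unnamed lambda parameters
211
+
212
+ ### Upstream bug fixes (32 items)
213
+ - **Chained calls**: `a.b().c()` is no longer wrongly placed in `selectors`; it is correctly nested as a qualifier chain
214
+ - **DecimalInteger inheritance**: inherits from `Integer` instead of skipping a level to `Literal`
215
+ - **Character token**: the char literal `'a'` produces a `Character` token instead of `String`
216
+ - **Annotations inside generics**: `List<@NotNull String>` parses correctly
217
+ - **void return type**: `return_type` is `'void'` instead of `None`
218
+ - **prefix/postfix preserved**: unary operators inside parentheses are no longer lost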
219
+
220
+ </details>
221
+
222
+ ## Project structure
223
+
224
+ ```
225
+ javalang/
226
+ ├── parse.py              # Entry points: parse() / parse_expression() etc.
227
+ ├── parser.py             # Recursive-descent parser (~2800 lines)
228
+ ├── tokenizer.py          # Lexer (~700 lines)
229
+ ├── tree.py               # AST node definitions (~340 lines)
230
+ ├── visitor.py            # Visitor-pattern traversal
231
+ ├── test/                 # Test cases (112)
232
+ │   ├── test_java_9_syntax.py
233
+ │   ├── test_java_10_11_syntax.py
234
+ │   ├── test_java_14_15_syntax.py
235
+ │   ├── test_java_16_17_syntax.py
236
+ │   ├── test_java_21_syntax.py
237
+ │   ├── test_upstream_issues.py    # Upstream bug regression tests
238
+ │   └── test_upstream_features.py  # Upstream feature tests
239
+ └── docs/
240
+     ├── architecture.md            # Architecture notes
241
+     ├── java-version-roadmap.md    # Java version support roadmap
242
+     ├── upstream-issues.md         # Classification of the 151 upstream issues
243
+     └── issue-fix-progress.md      # Fix progress tracking
244
+ ```
245
+
246
+ ## Acknowledgements
247
+
248
+ Built on [c2nes/javalang](https://github.com/c2nes/javalang) by Chris Thunes, and developed to provide Java parsing support for the [Kunlun-M](https://github.com/LoRexxar/Kunlun-M) white-box scanner.
249
+
250
+ ## License
251
+
252
+ MIT License (inherited from upstream)
@@ -0,0 +1,158 @@
1
+
2
+ ========
3
+ javalang
4
+ ========
5
+
6
+ .. image:: https://travis-ci.org/c2nes/javalang.svg?branch=master
7
+    :target: https://travis-ci.org/c2nes/javalang
8
+
9
+ .. image:: https://badge.fury.io/py/javalang.svg
10
+    :target: https://badge.fury.io/py/javalang
11
+
12
+ javalang is a pure Python library for working with Java source
13
+ code. javalang provides a lexer and parser targeting Java 8. The
14
+ implementation is based on the Java language spec available at
15
+ http://docs.oracle.com/javase/specs/jls/se8/html/.
16
+
17
+ The following gives a very brief introduction to using javalang.
18
+
19
+ ---------------
20
+ Getting Started
21
+ ---------------
22
+
23
+ .. code-block:: python
24
+
25
+     >>> import javalang
26
+     >>> tree = javalang.parse.parse("package javalang.brewtab.com; class Test {}")
27
+
28
+ This will return a ``CompilationUnit`` instance. This object is the root of a
29
+ tree which may be traversed to extract different information about the
30
+ compilation unit,
31
+
32
+ .. code-block:: python
33
+
34
+     >>> tree.package.name
35
+     'javalang.brewtab.com'
36
+     >>> tree.types[0]
37
+     ClassDeclaration
38
+     >>> tree.types[0].name
39
+     'Test'
40
+
41
+ The string passed to ``javalang.parse.parse()`` must represent a complete unit
42
+ which simply means it should represent a complete, valid Java source file. Other
43
+ methods in the ``javalang.parse`` module allow for some smaller code snippets to
44
+ be parsed without providing an entire compilation unit.
45
+
46
+ Working with the syntax tree
47
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
48
+
49
+ ``CompilationUnit`` is a subclass of ``javalang.ast.Node``, as are its
50
+ descendants in the tree. The ``javalang.tree`` module defines the different
51
+ types of ``Node`` subclasses, each of which represents one of the syntactic
52
+ elements you will find in Java code. For more detail on what node types are
53
+ available, see the ``javalang/tree.py`` source file until the documentation is
54
+ complete.
55
+
56
+ ``Node`` instances support iteration,
57
+
58
+ .. code-block:: python
59
+
60
+     >>> for path, node in tree:
61
+     ...     print(path, node)
62
+     ...
63
+     () CompilationUnit
64
+     (CompilationUnit,) PackageDeclaration
65
+     (CompilationUnit, [ClassDeclaration]) ClassDeclaration
66
+
67
+ This iteration can also be filtered by type,
68
+
69
+ .. code-block:: python
70
+
71
+     >>> for path, node in tree.filter(javalang.tree.ClassDeclaration):
72
+     ...     print(path, node)
73
+     ...
74
+     (CompilationUnit, [ClassDeclaration]) ClassDeclaration
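The iteration and filtering described above can be sketched with a small stand-in class. `SketchNode` is hypothetical and is not `javalang.ast.Node`; in particular, its paths contain only ancestor nodes, whereas the output above shows that javalang paths also include list containers.

```python
class SketchNode:
    """Hypothetical stand-in for a syntax-tree node."""

    def __init__(self, kind, children=()):
        self.kind = kind
        self.children = list(children)

    def __repr__(self):
        return self.kind

    def walk(self, path=()):
        """Yield (path, node) pairs depth-first; path holds the ancestors."""
        yield path, self
        for child in self.children:
            yield from child.walk(path + (self,))

    def __iter__(self):
        return self.walk()

    def filter(self, kind):
        """Yield only the (path, node) pairs whose node has the given kind."""
        for path, node in self:
            if node.kind == kind:
                yield path, node


tree = SketchNode("CompilationUnit", [
    SketchNode("PackageDeclaration"),
    SketchNode("ClassDeclaration", [SketchNode("MethodDeclaration")]),
])

for path, node in tree:
    print(path, node)

methods = [node for _, node in tree.filter("MethodDeclaration")]
print(methods)
```

Yielding the path together with the node lets a caller inspect the enclosing declarations of a match without keeping parent pointers on every node.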
75
+
76
+ ---------------
77
+ Component Usage
78
+ ---------------
79
+
80
+ Internally, the ``javalang.parse.parse`` method is a simple method which creates
81
+ a token stream for the input, initializes a new ``javalang.parser.Parser``
82
+ instance with the given token stream, and then invokes the parser's ``parse()``
83
+ method, returning the resulting ``CompilationUnit``. These components may
84
+ also be used individually.
85
+
86
+ Tokenizer
87
+ ^^^^^^^^^
88
+
89
+ The tokenizer/lexer may be invoked directly by calling ``javalang.tokenizer.tokenize``,
90
+
91
+ .. code-block:: python
92
+
93
+     >>> javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
94
+     <generator object tokenize at 0x1ce5190>
95
+
96
+ This returns a generator which provides a stream of ``JavaToken`` objects. Each
97
+ token carries position (line, column) and value information,
98
+
99
+ .. code-block:: python
100
+
101
+     >>> tokens = list(javalang.tokenizer.tokenize('System.out.println("Hello " + "world");'))
102
+     >>> tokens[6].value
103
+     '"Hello "'
104
+     >>> tokens[6].position
105
+     (1, 19)
106
+
107
+ The tokens are not directly instances of ``JavaToken``, but are instead
108
+ instances of subclasses which identify their general type,
109
+
110
+ .. code-block:: python
111
+
112
+     >>> type(tokens[6])
113
+     <class 'javalang.tokenizer.String'>
114
+     >>> type(tokens[7])
115
+     <class 'javalang.tokenizer.Operator'>
116
+
117
+
118
+ **NOTE:** The shift operators ``>>`` and ``>>>`` are represented by multiple
119
+ ``>`` tokens. This is because multiple ``>`` may appear in a row when closing
120
+ nested generic parameter/argument lists. This ambiguity is instead resolved by
121
+ the parser.
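A rough sketch of that idea, assuming a parser that already knows whether it is inside an expression or a generic argument list. The helper `read_angle_operator` and the plain-string tokens are hypothetical; this is not how javalang's parser is written.

```python
def read_angle_operator(tokens, index, in_expression):
    """Return (operator, next_index) for the '>' token at `index`.

    In an expression, up to three consecutive '>' tokens are combined into
    '>', '>>' or '>>>'. Inside generic type arguments each '>' stands alone
    and closes one argument list.
    """
    if tokens[index] != ">":
        raise ValueError("expected '>' at index %d" % index)
    if not in_expression:
        return ">", index + 1
    end = index
    while end < len(tokens) and tokens[end] == ">" and end - index < 3:
        end += 1
    return ">" * (end - index), end


# a >> b : the two '>' tokens form one shift operator
shift = read_angle_operator(["a", ">", ">", "b"], 1, in_expression=True)
print(shift)  # ('>>', 3)

# List<List<String>> : the same two tokens close two generic lists
close = read_angle_operator(
    ["List", "<", "List", "<", "String", ">", ">"], 5, in_expression=False
)
print(close)  # ('>', 6)
```

The sketch ignores whitespace between tokens; a real parser would presumably also check that the `>` tokens are adjacent in the source before treating them as a shift.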
122
+
123
+ Parser
124
+ ^^^^^^
125
+
126
+ To parse snippets of code, a parser may be used directly,
127
+
128
+ .. code-block:: python
129
+
130
+     >>> tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
131
+     >>> parser = javalang.parser.Parser(tokens)
132
+     >>> parser.parse_expression()
133
+     MethodInvocation
134
+
135
+ The parse methods are designed for incremental parsing so they will not restart
136
+ at the beginning of the token stream. Attempting to call a parse method more
137
+ than once will result in a ``JavaSyntaxError`` exception.
138
+
139
+ Invoking the incorrect parse method will also result in a ``JavaSyntaxError``
140
+ exception,
141
+
142
+ .. code-block:: python
143
+
144
+     >>> tokens = javalang.tokenizer.tokenize('System.out.println("Hello " + "world");')
145
+     >>> parser = javalang.parser.Parser(tokens)
146
+     >>> parser.parse_type_declaration()
147
+     Traceback (most recent call last):
148
+       File "<stdin>", line 1, in <module>
149
+       File "javalang/parser.py", line 336, in parse_type_declaration
150
+         return self.parse_class_or_interface_declaration()
151
+       File "javalang/parser.py", line 353, in parse_class_or_interface_declaration
152
+         self.illegal("Expected type declaration")
153
+       File "javalang/parser.py", line 122, in illegal
154
+         raise JavaSyntaxError(description, at)
155
+     javalang.parser.JavaSyntaxError
156
+
157
+ The ``javalang.parse`` module also provides convenience methods for parsing more
158
+ common types of code snippets.