aliyun-odps 0.1.0 → 0.4.0

Files changed (58)
  1. checksums.yaml +4 -4
  2. data/.gitignore +3 -0
  3. data/.rubocop.yml +31 -0
  4. data/Gemfile +3 -0
  5. data/README.md +55 -12
  6. data/Rakefile +15 -5
  7. data/aliyun-odps.gemspec +22 -11
  8. data/bin/console +10 -3
  9. data/lib/aliyun/odps.rb +69 -2
  10. data/lib/aliyun/odps/authorization.rb +90 -0
  11. data/lib/aliyun/odps/client.rb +40 -0
  12. data/lib/aliyun/odps/configuration.rb +16 -0
  13. data/lib/aliyun/odps/error.rb +97 -0
  14. data/lib/aliyun/odps/http.rb +138 -0
  15. data/lib/aliyun/odps/list.rb +40 -0
  16. data/lib/aliyun/odps/model/function.rb +16 -0
  17. data/lib/aliyun/odps/model/functions.rb +113 -0
  18. data/lib/aliyun/odps/model/instance.rb +130 -0
  19. data/lib/aliyun/odps/model/instance_task.rb +30 -0
  20. data/lib/aliyun/odps/model/instances.rb +119 -0
  21. data/lib/aliyun/odps/model/projects.rb +73 -0
  22. data/lib/aliyun/odps/model/resource.rb +26 -0
  23. data/lib/aliyun/odps/model/resources.rb +144 -0
  24. data/lib/aliyun/odps/model/table.rb +37 -0
  25. data/lib/aliyun/odps/model/table_column.rb +13 -0
  26. data/lib/aliyun/odps/model/table_partition.rb +9 -0
  27. data/lib/aliyun/odps/model/table_partitions.rb +90 -0
  28. data/lib/aliyun/odps/model/table_schema.rb +13 -0
  29. data/lib/aliyun/odps/model/tables.rb +125 -0
  30. data/lib/aliyun/odps/model/task_result.rb +9 -0
  31. data/lib/aliyun/odps/modelable.rb +16 -0
  32. data/lib/aliyun/odps/project.rb +47 -0
  33. data/lib/aliyun/odps/service_object.rb +27 -0
  34. data/lib/aliyun/odps/struct.rb +126 -0
  35. data/lib/aliyun/odps/tunnel/download_session.rb +98 -0
  36. data/lib/aliyun/odps/tunnel/router.rb +15 -0
  37. data/lib/aliyun/odps/tunnel/snappy_reader.rb +19 -0
  38. data/lib/aliyun/odps/tunnel/snappy_writer.rb +45 -0
  39. data/lib/aliyun/odps/tunnel/table_tunnels.rb +81 -0
  40. data/lib/aliyun/odps/tunnel/upload_block.rb +9 -0
  41. data/lib/aliyun/odps/tunnel/upload_session.rb +132 -0
  42. data/lib/aliyun/odps/utils.rb +102 -0
  43. data/lib/aliyun/odps/version.rb +1 -1
  44. data/requirements.png +0 -0
  45. data/wiki/error.md +188 -0
  46. data/wiki/functions.md +39 -0
  47. data/wiki/get_start.md +34 -0
  48. data/wiki/installation.md +15 -0
  49. data/wiki/instances.md +32 -0
  50. data/wiki/projects.md +51 -0
  51. data/wiki/resources.md +62 -0
  52. data/wiki/ssl.md +7 -0
  53. data/wiki/tables.md +75 -0
  54. data/wiki/tunnels.md +80 -0
  55. metadata +195 -13
  56. data/requirements.mindnode/QuickLook/Preview.jpg +0 -0
  57. data/requirements.mindnode/contents.xml +0 -10711
  58. data/requirements.mindnode/viewState.plist +0 -0
data/lib/aliyun/odps/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  module Aliyun
2
2
  module Odps
3
- VERSION = "0.1.0"
3
+ VERSION = '0.4.0'
4
4
  end
5
5
  end
data/requirements.png ADDED
Binary file
data/wiki/error.md ADDED
@@ -0,0 +1,188 @@
1
+ ## ERROR Code
2
+
3
+ In our library, `Aliyun::Odps::RequestError` is raised when a request fails, so ***wrap your calls in begin...rescue***. Below is an example.
4
+
5
+
6
+ begin
7
+ projects = Aliyun::Odps.list_projects
8
+ puts projects.map(&:name)
9
+ rescue Aliyun::Odps::RequestError => e
10
+ puts "Code: #{e.code}"
11
+ puts "Message: #{e.message}"
12
+ puts "Request ID: #{e.request_id}"
13
+ end
14
+
15
+ ODPS defines many error codes; you can find more detail in this section.
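Several codes in the tables in this document (for example `ServiceUnavailable` and `InternalServerError`) carry a "please try again later" message. A minimal retry sketch for such cases follows. It assumes `e.code` holds the string from the CODE column; `FakeRequestError`, `RETRYABLE_CODES` and `with_retries` are hypothetical names used only for illustration and are not part of the library.

```ruby
# Hypothetical stand-in so the sketch runs without the gem installed;
# in real code you would pass Aliyun::Odps::RequestError instead.
class FakeRequestError < StandardError
  attr_reader :code

  def initialize(code, message = code)
    super(message)
    @code = code
  end
end

# Codes whose message suggests trying again later (assumed to match e.code).
RETRYABLE_CODES = %w[ServiceUnavailable InternalServerError].freeze

# Run the block, retrying with exponential backoff on retryable codes only.
def with_retries(error_class, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue error_class => e
    # Re-raise anything that is not retryable, or when attempts run out.
    raise unless RETRYABLE_CODES.include?(e.code.to_s) && tries < attempts
    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end
```

With the gem loaded, a call might look like `with_retries(Aliyun::Odps::RequestError) { Aliyun::Odps.list_projects }`. Errors such as `NoSuchObject` are re-raised immediately, since repeating the request would not help.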
16
+
17
+ If you need help with a problem, search for it on Google or the Aliyun forum. If you still have trouble after that, contact the ODPS developers and provide your error code and request ID; they are friendly and respond quickly.
18
+
19
+
20
+ Below are the ODPS error codes:
21
+
22
+ ### ODPS Common ERROR Code
23
+
24
+ | CODE | Message | HTTP STATUS CODE |
25
+ |------|---------|------------------|
26
+ |ServiceUnavailable |Service is temporarily unavailable, Please try again later. | 503 Service Unavailable|
27
+ |InternalServerError| Service internal error, please try again later.| 500 Internal Server Error|
28
+ |MethodNotAllowed| Unsupported request method.| 405 Method Not Allowed|
29
+ |MissingAuthorization| Authorization is required.| 401 Unauthorized|
30
+ |InvalidAuthorization| Invalid authorization.|401 Unauthorized|
31
+ |MissingContentType |Content-type is required.| 415 Unsupported Media Type|
32
+ |InvalidContentType |Invalid content type. | 415 Unsupported Media Type|
33
+ |MissingContentLength| Content-length is required.| 411 Length Required|
34
+ |ObjectAlreadyExists| The "" has already existed.|409 Conflict|
35
+ |NoSuchObject| Object not found-''.|404 Not found|
36
+ |NoPermission| You haven't enough privilege to XX.|403 Forbidden|
37
+ |InvalidParameter| Invalid parameter XX|400 Bad Request|
38
+ |InvalidStatusSetting| Current status is . It is not allowed to set status to|409 Conflict|
39
+ |DeleteConflict|Cannot delete object which is using|409 Conflict|
40
+ |AccessDenied| Access Denied|403 Forbidden|
41
+
42
+
43
+ ### SQL ERROR Code
44
+
45
+ In general, ODPS returns SQL error codes in the `ODPS-MMCCCCX` format. In the table below, the last digit (`X`) matches the severity level.
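A small sketch of splitting such a code into its parts. Reading the final digit as the severity level is consistent with the SQL error table in this section. Treating `MM` as a module identifier and `CCCC` as an error number is an assumption, and `SqlErrorCode` and `parse_odps_code` are hypothetical names, not part of the library.

```ruby
# Parts of an ODPS-MMCCCCX code. The meaning of module_id and number is an
# assumption; severity follows the Severity Level column of the SQL table.
SqlErrorCode = Struct.new(:module_id, :number, :severity)

# Return a SqlErrorCode, or nil when the string is not in ODPS-MMCCCCX form.
def parse_odps_code(code)
  match = /\AODPS-(\d{2})(\d{4})(\d)\z/.match(code.to_s)
  return nil unless match

  SqlErrorCode.new(match[1], match[2], match[3].to_i)
end

parsed = parse_odps_code('ODPS-0130252')
puts parsed.severity if parsed
```

Codes from the common and Tunnel tables, such as `NoSuchObject`, do not follow this format, so the helper returns `nil` for them.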
46
+
47
+ The table below lists all SQL error codes:
48
+
49
+ |Code |Severity Level |ERROR Message |
50
+ |-----|---------------|--------------|
51
+ |ODPS-0110005| 5 | Unknown exception from metadata operation|
52
+ |ODPS-0110011| 1 |Authorization exception|
53
+ |ODPS-0110021| 1 | Invalid parameters|
54
+ |ODPS-0110031| 1 | Invalid object type|
55
+ |ODPS-0110041| 1 | Invalid meta operation|
56
+ |ODPS-0110061| 1 | Failed to run ddltask|
57
+ |ODPS-0110071| 1 | OTS initialization exception|
58
+ |ODPS-0110081| 1 | OTS transaction exception|
59
+ |ODPS-0110091| 1 | OTS filtering exception|
60
+ |ODPS-0110101| 1 |OTS processing exception|
61
+ |ODPS-0110111| 1 |OTS invalid data object|
62
+ |ODPS-0110121| 1 |Unknown OTS exception|
63
+ |ODPS-0110131| 1 | StorageDescriptor compression exception|
64
+ |ODPS-0110141| 1 |Data version exception|
65
+ |ODPS-0110999| 9 |Critical! Internal error happened in commit operation and rollback failed, possible breach of atomicity|
66
+ |ODPS-0120005| 5 | Unknown exception from processor|
67
+ |ODPS-0120011| 1 |Authorization exception|
68
+ |ODPS-0120021| 1 |the delimitor must be the same in wm_concat|
69
+ |ODPS-0120031| 1 |Instance has been cancelled|
70
+ |ODPS-0121011| 1 | Invalid regular expression pattern|
71
+ |ODPS-0121021| 1 | Regexec call failed|
72
+ |ODPS-0121035| 5 |Illegal implicit type cast|
73
+ |ODPS-0121045| 5 | Unsupported return type|
74
+ |ODPS-0121055| 5 | Empty argument value|
75
+ |ODPS-0121065| 5 |Argument value out of range|
76
+ |ODPS-0121075| 5 |Invalid number of arguments|
77
+ |ODPS-0121081| 1 |Illegal argument type|
78
+ |ODPS-0121095| 5 |Invalid arguments|
79
+ |ODPS-0121105| 5 |Constant argument value expected|
80
+ |ODPS-0121115| 5 |Column reference expected|
81
+ |ODPS-0121125| 5 |Unsupported function or operation|
82
+ |ODPS-0121135| 5 |Malloc memory failed|
83
+ |ODPS-0121145| 5 | Data overflow|
84
+ |ODPS-0123019| 9 |Distributed file operation exception|
85
+ |ODPS-0123023| 3 |Unsupported reduce type|
86
+ |ODPS-0123031| 1 |Partition exception|
87
+ |ODPS-0123043| 3 |buffer overflow|
88
+ |ODPS-0123055| 5 |Script exception|
89
+ |ODPS-0123065| 5 |Join exception|
90
+ |ODPS-0123075| 5 |Hash exception|
91
+ |ODPS-0123081| 1 |Invalid datetime string|
92
+ |ODPS-0123091| 1 |Illegal type cast|
93
+ |ODPS-0123105| 5 |Job got killed|
94
+ |ODPS-0123111| 1 |format string does not match datetime string|
95
+ |ODPS-0123121| 1 |Mapjoin exception|
96
+ |ODPS-0123131| 1 |User defined function exception|
97
+ |ODPS-0130005| 5 |Unknown exception from parser|
98
+ |ODPS-0130013| 3 | Authorization exception|
99
+ |ODPS-0130025| 5 |Failed to I/O|
100
+ |ODPS-0130031| 1 |Failed to drop table|
101
+ |ODPS-0130041| 1 |Statistics exception|
102
+ |ODPS-0130051| 1 |Exception in sub query|
103
+ |ODPS-0130061| 1 |Invalid table|
104
+ |ODPS-0130071| 1 |Semantic analysis exception|
105
+ |ODPS-0130081| 1 |Invalid UDF reference|
106
+ |ODPS-0130091| 1 |Invalid parameters|
107
+ |ODPS-0130101| 1 | Ambiguous data type|
108
+ |ODPS-0130111| 1 |Subquery partition pruning exception|
109
+ |ODPS-0130121| 1 |Invalid argument type|
110
+ |ODPS-0130131| 1 |Table not found|
111
+ |ODPS-0130141| 1 |Illegal implicit type cast|
112
+ |ODPS-0130151| 1 |Illegal data type|
113
+ |ODPS-0130161| 1 |Parse exception|
114
+ |ODPS-0130171| 1 |Creating view exception|
115
+ |ODPS-0130181| 1 |Window function exception|
116
+ |ODPS-0130191| 1 |Invalid column or partition key|
117
+ |ODPS-0130201| 1 |View not found|
118
+ |ODPS-0130211| 1 |Table or view already exists|
119
+ |ODPS-0130221| 1 |Invalid number of arguments|
120
+ |ODPS-0130231| 1 |Invalid view|
121
+ |ODPS-0130241| 1 |Illegal union operation|
122
+ |ODPS-0130252| 2 |Cartesian product is not allowed|
123
+ |ODPS-0130261| 1 |Invalid schema|
124
+ |ODPS-0130271| 1 |Partition does not exist|
125
+ |ODPS-0140005| 5 | Unknown exception from planner|
126
+ |ODPS-0140011| 1 |Illegal type cast|
127
+ |ODPS-0140021| 1 |Illegal implicit type cast|
128
+ |ODPS-0140031| 1 |Invalid column reference|
129
+ |ODPS-0140041| 1 |Invalid UDF reference|
130
+ |ODPS-0140051| 1 |Invalid function|
131
+ |ODPS-0140061| 1 |Invalid parameters|
132
+ |ODPS-0140071| 1 | Unsupported operator|
133
+ |ODPS-0140081| 1 |Unsupported join type|
134
+ |ODPS-0140091| 1 | Unsupported stage type|
135
+ |ODPS-0140105| 5 |Invalid multiple I/O|
136
+ |ODPS-0140133| 3 |Invalid structure|
137
+ |ODPS-0140151| 1 |Can not do topologic sort, the stages is not a DAG|
138
+ |ODPS-0140171| 1 |Sandbox violation exception|
139
+ |ODPS-0140181| 1 |Sql plan exception|
140
+
141
+ ### ODPS Tunnel ERROR Code
142
+
143
+ The Tunnel service has its own set of error codes, listed below:
144
+
145
+ |Code |Message |
146
+ |-----|--------|
147
+ |AccessDenied| Access Denied|
148
+ |CorruptedDataStream |The data stream was corrupted, please try again later |
149
+ |DataUnderReplication |The specified table data is under replication and you cannot initiate upload or download at this time. Please try again later|
150
+ |DataVersionConflict |The specified table has been modified since the upload or download initiated and table data is being replicated at this time. Please initiate another download or upload later|
151
+ |FlowExceeded |Your flow quota is exceeded |
152
+ |InConsistentBlockList |The specified block list is not consistent with the uploaded block list on server side |
153
+ |IncompleteBody| You did not provide the number of bytes specified by the Content-Length HTTP header |
154
+ |InternalServerError |Service internal error, please try again later|
155
+ |InvalidArgument |Invalid argument |
156
+ |InvalidBlockID |The specified block id is not valid|
157
+ |InvalidColumnSpec |The specified columns is not valid|
158
+ |InvalidRowRange |The specified row range is not valid|
159
+ |InvalidStatusChange| You cannot change the specified upload or download status |
160
+ |InvalidURI |Couldn’t parse the specified URI |
161
+ |InvalidUriSpec |The specified uri spec is not valid|
162
+ |MalformedDataStream |The data stream you provided was not well-formed or did not validate against schema |
163
+ |MalformedHeaderValue |An HTTP header value was malformed|
164
+ |MalformedXML |The XML you provided was not well-formed or did not validate against schema |
165
+ |MaxMessageLengthExceeded |Your request was too big |
166
+ |MethodNotAllowed |The specified method is not allowed against this resource |
167
+ |MissingContentLength |You must provide the Content-Length HTTP header|
168
+ |MissingPartitionSpec |You need to specify a partitionspec along with the specified table |
169
+ |MissingRequestBodyError |The request body is missing|
170
+ |MissingRequiredHeaderError |Your request was missing a required header|
171
+ |NoPermission |You do not have enough privilege to complete the specified operation |
172
+ |NoSuchData |The uploaded data within this uploaded no longer exists|
173
+ |NoSuchDownload |The specified download id does not exist|
174
+ |NoSuchPartition |The specified partition does not exist|
175
+ |NoSuchProject |The specified project name does not exist|
176
+ |NoSuchTable |The specified table name does not exist|
177
+ |NoSuchUpload |The specified upload id does not exist|
178
+ |NoSuchVolume |The specified volume name does not exist|
179
+ |NoSuchVolumeFile |The specified volume file does not exist|
180
+ |NoSuchVolumePartition| The specified volume partition does not exist|
181
+ |NotImplemented |A header you provided implies functionality that is not implemented |
182
+ |ObjectModified |The specified object has been modified since the specified timestamp |
183
+ |RequestTimeOut |Your socket connection to the server was not read from or written to within the timeout period |
184
+ |ServiceUnavailable |Service is temporarily unavailable, Please try again later |
185
+ |StatusConflict |You cannot complete the specified operation under the current upload or download status |
186
+ |TableModified |The specified table has been modified since the download initiated. Try initiate another download |
187
+ |Unauthorized |The request authorization header is invalid or missing|
188
+ |UnexpectedContent |This request does not support content|
data/wiki/functions.md ADDED
@@ -0,0 +1,39 @@
1
+ ## Function
2
+
3
+ We can define functions in ODPS, which can then be used in ODPS SQL.
4
+
5
+ Function Notes:
6
+
7
+ + Creating a function requires a name, a type, and one or more resources.
8
+ + A function name must be unique within a project.
9
+ + After it is created, a function can be used in SQL.
10
+
11
+
12
+ ### Basic
13
+
14
+ To list all functions:
15
+
16
+ project.functions.list
17
+
18
+ To get function:
19
+
20
+ project.functions.get('function_name')
21
+
22
+
23
+ ### Create Function
24
+
25
+ To create a function, create its resources first:
26
+
27
+
28
+ resource1 = project.resources.get('resource1')
29
+ resource2 = project.resources.get('resource2')
30
+
31
+ function = project.functions.create('function1', 'Path/to/Class', [resource1, resource2])
32
+
33
+
34
+ To delete it:
35
+
36
+ project.functions.delete('function1')
37
+
38
+
39
+ That covers functions. Next, let's visit the most powerful part: [Instances](./instances.md)
data/wiki/get_start.md ADDED
@@ -0,0 +1,34 @@
1
+ ## Get Started
2
+
3
+ First of all, configure your environment.
4
+
5
+ If you use Rails, you can place the configuration in `config/initializers/aliyun_odps.rb`. For other frameworks, place it before any other ODPS code.
6
+
7
+ Aliyun::Odps.configure do |config|
8
+ config.access_key = '<your-access-key>'
9
+ config.secret_key = '<your-secret-key>'
10
+ config.endpoint = '<odps-server-api>' # "http://service.odps.aliyun.com/api"
11
+ config.project = '<your-default-project>'
12
+ end
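To keep keys out of source control, the values can be read from environment variables before they are handed to the configure block. The sketch below is one way to do that. The environment variable names, `ODPS_ENV_VARS` and `odps_settings` are all hypothetical; only the four config attributes come from the example above.

```ruby
# Hypothetical environment variable names; use whatever your deployment provides.
ODPS_ENV_VARS = {
  access_key: 'ODPS_ACCESS_KEY',
  secret_key: 'ODPS_SECRET_KEY',
  endpoint:   'ODPS_ENDPOINT',
  project:    'ODPS_PROJECT'
}.freeze

# Build a settings hash from the environment, failing early on missing values.
def odps_settings(env = ENV)
  settings = ODPS_ENV_VARS.transform_values { |var| env[var] }
  missing = settings.select { |_, value| value.nil? || value.empty? }.keys
  unless missing.empty?
    names = missing.map { |key| ODPS_ENV_VARS[key] }.join(', ')
    raise ArgumentError, "missing ODPS settings: #{names}"
  end
  settings
end

# Usage with the configure block shown above (needs the aliyun-odps gem):
#   Aliyun::Odps.configure do |config|
#     odps_settings.each { |key, value| config.public_send("#{key}=", value) }
#   end
```

Failing at boot with a list of missing variables tends to be easier to debug than an authorization error on the first request.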
13
+
14
+ After that, you can get your project and get started!
15
+
16
+ project = Aliyun::Odps.project
17
+
18
+ # Access tables
19
+ project.tables.list
20
+
21
+ # Access resources
22
+ project.resources.list
23
+
24
+ # Access instances
25
+ project.instances.list
26
+
27
+ # Access functions
28
+ project.functions.list
29
+
30
+
31
+ # For Tunnel
32
+ project.table_tunnels
33
+
34
+ Next, let's visit [Project](./projects.md)
data/wiki/installation.md ADDED
@@ -0,0 +1,15 @@
1
+ ## Installation
2
+
3
+ It's a Ruby Gem, so you can install it like any Gem:
4
+
5
+ $ gem install aliyun-odps
6
+
7
+ If you manage your gems with a Gemfile, add the line below to it:
8
+
9
+ gem "aliyun-odps"
10
+
11
+ And run:
12
+
13
+ $ bundle install
14
+
15
+ Now, [Getting started](./get_start.md)
data/wiki/instances.md ADDED
@@ -0,0 +1,32 @@
1
+ ## Instance
2
+
3
+ An instance is a job instance created by a user. It currently supports these task types: SQL, SQLPLAN, MapReduce, DT, and PLSQL.
4
+
5
+ We can define different tasks to create an instance.
6
+
7
+ + SQL Task
8
+
9
+ To create a SQL Task:
10
+
11
+ task = Aliyun::Odps::InstanceTask.new(name: 'SqlTask', comment: 'sql task', query: 'select * from test_table1;', type: 'SQL')
12
+ instance = project.instances.create([task])
13
+
14
+ # wait for it to terminate
15
+ instance.wait_for_terminated
16
+
17
+ # puts the results
18
+ task_result = instance.task_results[task.name]
19
+
20
+ puts task_result.result
21
+
22
+ # write task result into csv file
23
+ File.open('test.csv', "w") {|f| f.write task_result.result['__content__'] }
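Instead of writing the text straight to disk, the result can also be parsed into rows with Ruby's standard `csv` library. This is a sketch under two assumptions: that `task_result.result['__content__']` is CSV text, as the example above suggests, and that its first line is a header row. `rows_from_csv` is a hypothetical helper name, and the sample string stands in for a real task result.

```ruby
require 'csv'

# Parse CSV text into an array of hashes keyed by the header row.
# Assumes the first line of the content holds the column names.
def rows_from_csv(content)
  CSV.parse(content, headers: true).map(&:to_h)
end

# Sample text standing in for task_result.result['__content__'].
sample = "id,name\n1,foo\n2,bar\n"

rows_from_csv(sample).each do |row|
  puts "#{row['id']}: #{row['name']}"
end
```

All values come back as strings, so convert numeric columns explicitly if you need numbers.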
24
+
25
+
26
+ To create another type of task, replace SQL with the type you need; see the list of supported types mentioned above.
27
+
28
+
29
+ To write more powerful SQL, visit [How to use ODPS SQL](https://help.aliyun.com/document_detail/odps/SQL/summary.html?spm=5176.docodps/SQL/ddl.3.2.LdRubj)
30
+
31
+
32
+ After Instance, let's visit [Table Tunnel](./tunnels.md), which helps us upload data to and download data from ODPS tables.
data/wiki/projects.md ADDED
@@ -0,0 +1,51 @@
1
+ ## Project
2
+
3
+ [Project](https://docs.aliyun.com/#/pub/odps/basic/definition&project) is the basic unit that keeps your system organized. It is very similar to a `database` or `schema` in a traditional relational database.
4
+
5
+ Besides the way to get your project described in the Readme, here are some more features for accessing your projects.
6
+
7
+
8
+ ### LIST
9
+
10
+ We could list all available projects by calling `Aliyun::Odps.list_projects`.
11
+
12
+
13
+ #list projects by options
14
+ projects = Aliyun::Odps.list_projects(
15
+ owner: 'owner', #specify the project owner
16
+ marker: 'marker', #specify marker for paginate
17
+ maxitems: '1' #specify maxitems in this request, default 1000
18
+ )
19
+
20
+
21
+ It supports three options: `owner`, `marker` and `maxitems`; see the example above for usage.
22
+
23
+ Besides, it has an alias, `Aliyun::Odps.projects`; choose your favorite style!
24
+
25
+
26
+ ### GET
27
+
28
+ As you know, `Aliyun::Odps.project` returns your default project, but if you pass a name, you can quickly get another project.
29
+
30
+ We can call `Aliyun::Odps.project` to fetch a project's information:
31
+
32
+
33
+ project = Aliyun::Odps.project #get the default project
34
+ project = Aliyun::Odps.project('my_other_project') #fetch a project by name
35
+
36
+ If no project name is provided, it returns the default project configured by `Aliyun::Odps.configure`.
37
+
38
+ If you pass the name of a nonexistent project, it raises `Aliyun::Odps::RequestError: NoSuchObject: ODPS-0420111`.
39
+
40
+ By the way, in our library every failed request to ODPS raises `Aliyun::Odps::RequestError`. For more detail, visit [Error](./error.md)
41
+
42
+
43
+ ### UPDATE
44
+
45
+ Right now you can only update a project's comment, by passing options to the update method:
46
+
47
+
48
+ #update project by options
49
+ Aliyun::Odps.update_project('my_project_name', comment: 'just for test')
50
+
51
+ Now, let's go to [Tables](./tables.md)
data/wiki/resources.md ADDED
@@ -0,0 +1,62 @@
1
+ ## Resources
2
+
3
+ Resources are an important part of ODPS.
4
+
5
+ To list all resources:
6
+
7
+ project.resources.list
8
+
9
+ In ODPS, resource type can be: py, jar, archive, file and table.
10
+
11
+ ### Create Resource
12
+
13
+
14
+ #### File resource
15
+
16
+ py, jar, archive and file are file resources. The example below creates a file resource:
17
+
18
+ resource = project.resources.create('file', 'test_file_resource', file: "path/to/resource.rb")
19
+
20
+ # or
21
+
22
+ resource = project.resources.create('file', 'test_file_resource', file: "puts 'hello world'")
23
+
24
+
25
+ #### Table resource
26
+
27
+ To create a resource from a table that already exists in the project, pass `table` with the table path:
28
+
29
+ resource = project.resources.create('table', 'test_table_resource', table: "test_table partition(partname='part1')")
30
+
31
+ Please note the format of the table argument:
32
+
33
+ + "tablename partition(col=value,[col=value])" for table with partition
34
+ + "tablename" for table without any partitions
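The two formats above can be produced by a small helper rather than by hand-built strings. This is only a sketch: `table_resource_spec` is a hypothetical name, and quoting every partition value with single quotes follows the `partname='part1'` example rather than any documented rule.

```ruby
# Build the table argument for a table resource.
# partition is a hash of partition column => value; empty means no partition.
def table_resource_spec(table_name, partition = {})
  return table_name.to_s if partition.empty?

  # Values are single-quoted, mirroring the partname='part1' example.
  pairs = partition.map { |column, value| "#{column}='#{value}'" }
  "#{table_name} partition(#{pairs.join(',')})"
end

puts table_resource_spec('test_table', partname: 'part1')
puts table_resource_spec('test_table')
```

The returned string would then be passed as the `table:` option of `project.resources.create`, as in the example above.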
35
+
36
+ ### Get Resource
37
+
38
+ To get resource content:
39
+
40
+ resource = project.resources.get('test_file_resource')
41
+
42
+ File.open(resource.name, "w") do |f|
43
+ f.write resource.content
44
+ end
45
+
46
+ To get resource information without the content:
47
+
48
+ resource = project.resources.head('test_file_resource')
49
+
50
+
51
+ ### Others
52
+
53
+ Besides, you can update resource:
54
+
55
+ resource = project.resources.update('file', 'test_file_resource', file: "puts 'hello rio'")
56
+
57
+ To delete a resource:
58
+
59
+ project.resources.delete('test_file_resource')
60
+
61
+
62
+ After resource, let's visit [Functions](./functions.md)
data/wiki/ssl.md ADDED
@@ -0,0 +1,7 @@
1
+ ## SSL Connection
2
+
3
+ We support SSL connection in two ways:
4
+
5
+ 1. To encrypt requests without certificate verification, just configure an https endpoint, e.g. `https://service.odps.aliyun.com/api`
6
+
7
+ 2. To also enable certificate verification, additionally configure `ssl_ca_file`, e.g. `config.ssl_ca_file = 'path/to/your/ssl/ca/file'`