RubyGems - rdbi - Versions diffs - 0.9.0 - Mend

rdbi 0.9.0


Files changed (27)

data/lib/rdbi/schema.rb ADDED

@@ -0,0 +1,121 @@
+RDBI::Schema = Struct.new(
+  :columns,
+  :tables,
+  :type
+)
+RDBI::Column = Struct.new(
+  :name,
+  :type,
+  :ruby_type,
+  :precision,
+  :scale,
+  :nullable,
+  :metadata,
+  :default,
+  :table
+)
+#
+# RDBI::Schema is the metadata representation of a single schema "object", such
+# as the schema for a single table or the data queried against during
+# RDBI::Statement execution.
+#
+# RDBI::Schema is the foundation for type management via RDBI::Type, so an
+# incomplete schema will lead to type inconsistency. It is therefore
+# *critical* that driver authors implement RDBI::Schema properly.
+#
+# RDBI::Schema is a plain Struct under the hood and responds to the usual Struct methods.
+#
+class RDBI::Schema
+  ##
+  # :attr_reader: columns
+  #
+  # Array of RDBI::Column objects associated with this schema.
+  #
+  ##
+  # :attr_reader: tables
+  #
+  # Array of table names and views associated with this schema, represented as symbols.
+  ##
+  # :attr_reader: type
+  #
+  # In some instances, the type (freeform, String) may be provided as an
+  # optional hint as to what kind of schema this is.
+  #
+end
+#
+# RDBI::Column is the metadata representation of a single table column. You
+# will typically access this via RDBI::Schema.
+#
+# For tables, columns represent the columns of the table's schema. For queries,
+# columns represent anything that identifies a column of the result set,
+# including aggregates, other functions, dynamic queries, etc.
+#
+class RDBI::Column
+  ##
+  # :attr_reader: name
+  #
+  # The name of the column, as a symbol.
+  ##
+  # :attr_reader: type
+  #
+  # The database-specific type, as a symbol.
+  ##
+  # :attr_reader: ruby_type
+  #
+  # The Ruby target type, as a symbol. This is used by RDBI::Type to convert data
+  # in rows.
+  ##
+  # :attr_reader: precision
+  #
+  # The precision of the type. This is typically the first number in an
+  # extended type form, such as +NUMBER(1)+.
+  #
+  # This value is not always *really* a precision; its meaning depends on the
+  # type. Consult your database documentation for more information.
+  #
+  ##
+  # :attr_reader: scale
+  #
+  # The scale of the type. This is typically the second number in an extended
+  # type form, such as +NUMBER(10,2)+.
+  #
+  # As with precision, this may not *really* be scale and it is recommended you
+  # consult your database documentation for specific, especially non-numeric,
+  # types.
+  #
+  ##
+  # :attr_reader: nullable
+  #
+  # Boolean: does this column accept null?
+  ##
+  # :attr_reader: metadata
+  #
+  # Free-form field for driver authors to provide data that lives outside of
+  # this specification.
+  ##
+  # :attr_reader: default
+  #
+  # The value provided to the column when it is not specified but requested for
+  # use, such as in +INSERT+ statements.
+  #
+  ##
+  # :attr_reader: table
+  #
+  # The table this column belongs to, if known, as a symbol.
+  #
+end
+# vim: syntax=ruby ts=2 et sw=2 sts=2
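
The two structs above can be exercised without a database. The following is a minimal sketch, assuming a stand-in `SchemaSketch` namespace (hypothetical, so that it runs without the rdbi gem) whose members mirror the fields in the diff. The `users` table and its columns are invented for illustration.

```ruby
# Stand-in namespace so the sketch runs without the rdbi gem installed.
# The member lists mirror RDBI::Schema and RDBI::Column from the diff.
module SchemaSketch
  Schema = Struct.new(:columns, :tables, :type)
  Column = Struct.new(
    :name, :type, :ruby_type, :precision, :scale,
    :nullable, :metadata, :default, :table
  )
end

# A hypothetical "users" table with two columns.
id_col    = SchemaSketch::Column.new(:id, :integer, :integer, nil, nil, false, nil, nil, :users)
price_col = SchemaSketch::Column.new(:price, :number, :decimal, 10, 2, true, nil, nil, :users)

schema = SchemaSketch::Schema.new([id_col, price_col], [:users], "table")

# Struct members read like ordinary attributes.
column_names = schema.columns.map(&:name)
ruby_types   = schema.columns.map(&:ruby_type)
nullable     = schema.columns.select(&:nullable).map(&:name)

puts column_names.inspect
puts ruby_types.inspect
puts nullable.inspect
```

The `ruby_type` member is the one a type map would be keyed on when converting fetched rows, which is why an incomplete schema leads to inconsistent types.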

data/lib/rdbi/statement.rb ADDED

@@ -0,0 +1,204 @@
+#
+# RDBI::Statement is the encapsulation of a single prepared statement (query).
+# A statement can be executed with varying arguments multiple times through a
+# facility called 'binding'.
+#
+# == About Binding
+#
+# Binding is the database term for facilitating placeholder replacement similar
+# to formatters such as "sprintf()", but in a database-centric way:
+#
+#   select * from my_table where some_column = ?
+#
+# The question mark is the placeholder here; upon execution, the user will be
+# asked to provide values to fill that placeholder with.
+#
+# There are two major advantages to binding:
+#
+# * Multiple execution of the same statement with variable data
+#
+# For example, the above statement could be executed 12 times over an iterator:
+#
+#   # RDBI::Database#prepare creates a prepared statement
+#
+#   sth = dbh.prepare('select * from my_table where some_column = ?')
+#
+#   # there is one placeholder here, so we'll use the iterator itself to feed
+#   # to the statement at execution time.
+#   #
+#   # This will run the select statement above 12 times, with 0 through 11
+#   # passed in turn as the substitution for the single placeholder. Use
+#   # RDBI::Database#preprocess_query to see what these queries would look
+#   # like.
+#
+#   12.times do |x|
+#     sth.execute(x)
+#   end
+#
+# * Protection against attacks such as SQL injection in a consistent way (see below).
+#
+# == Native client binding
+#
+# Binding is typically *not* just text replacement; it is a client-oriented
+# operation that barely involves itself in the string at all. The query is
+# parsed by the SQL engine, then the placeholders are requested; at this point,
+# the client supplies the bound values to the database, which then uses them
+# in the *internal* representation of the query. This is why the following is
+# totally legal in a binding scenario:
+#
+#   # RDBI::Database#execute is a way to prepare and execute statements immediately.
+#   dbh.execute("select * from my_table where some_column = ?", "; drop table my_table;")
+#
+# For purposes of instruction, this resolves to:
+#
+#   select * from my_table where some_column = '; drop table my_table;'
+#
+# *BUT*, as mentioned above, the query is actually sent in two stages. The database first gets this:
+#
+#   select * from my_table where some_column = ?
+#
+# Then a single element tuple is sent:
+#
+#   ["; drop table my_table;"]
+#
+# These are combined *post-parsing* to create a full, legal statement, so
+# no grammar rules can be exploited.
+#
+# As a result, placeholder rules in this scenario are pretty rigid: only values
+# can be used. For example, you cannot supply placeholders for:
+#
+# * table names
+# * sql keywords and functions
+# * other elements of syntax (punctuation, etc)
+#
+# == Preprocessing
+#
+# Preprocessing is a fallback mechanism we use when the underlying database
+# does not support the above mechanism. Unlike native client binding, it is
+# basically text replacement, so all those rules about what you can and cannot
+# do go away.
+#
+# The downside is that if our replacement system (provided by the Epoxy class,
+# which itself is provided by the epoxy gem) is flawed, SQL injection attacks
+# may be possible.
+#
+class RDBI::Statement
+  extend MethLab
+  # the RDBI::Database handle that created this statement.
+  attr_reader :dbh
+  # The query this statement was created for.
+  attr_reader :query
+  # A mutex for locked operations. Basically a cached copy of Mutex.new.
+  attr_reader :mutex
+  # The input type map provided during statement creation -- used for binding.
+  attr_reader :input_type_map
+  ##
+  # :attr_reader: last_result
+  #
+  # The last RDBI::Result this statement yielded.
+  attr_threaded_accessor :last_result
+  ##
+  # :attr_reader: finished
+  #
+  # Has this statement been finished?
+  ##
+  # :attr_reader: finished?
+  #
+  # Has this statement been finished?
+  inline(:finished, :finished?)   { @finished  }
+  ##
+  # :attr_reader: driver
+  #
+  # The RDBI::Driver object that this statement belongs to.
+  inline(:driver)                 { dbh.driver }
+  #
+  # Initialize a statement handle, given a text query and the RDBI::Database
+  # handle that created it.
+  #
+  def initialize(query, dbh)
+    @query = query
+    @dbh   = dbh
+    @mutex = Mutex.new
+    @finished = false
+    @input_type_map = RDBI::Type.create_type_hash(RDBI::Type::In)
+    @dbh.open_statements.push(self)
+  end
+  #
+  # Execute the statement with the supplied binds.
+  #
+  def execute(*binds)
+    raise StandardError, "you may not execute a finished handle" if @finished
+    # XXX if we ever support some kind of hash type, this'll get ugly.
+    hashes, binds = binds.partition { |x| x.kind_of?(Hash) }
+    # partition always returns two arrays, so hashes is never nil here.
+    hashes.collect! do |hash|
+      newhash = { }
+      hash.each do |key, value|
+        newhash[key] = RDBI::Type::In.convert(value, @input_type_map)
+      end
+      newhash
+    end
+    binds = hashes + binds.collect { |x| RDBI::Type::In.convert(x, @input_type_map) }
+    mutex.synchronize do
+      exec_args = *new_execution(*binds)
+      self.last_result = RDBI::Result.new(self, binds, *exec_args)
+    end
+  end
+  #
+  # Deallocate any internal resources devoted to the statement. It will not be
+  # usable after this is called.
+  #
+  # Driver implementors will want to override this in their subclass, do their
+  # thing and call 'super' as their last statement.
+  #
+  def finish
+    mutex.synchronize do
+      dbh.open_statements.reject! { |x| x.object_id == self.object_id }
+      @finished = true
+    end
+  end
+  ##
+  # :method: new_execution
+  # :call-seq: new_execution(*binds)
+  #
+  # Database drivers will override this method in their respective RDBI::Statement
+  # subclasses. This method is called when RDBI::Statement#execute or
+  # RDBI::Database#execute is called.
+  #
+  # Implementations of this method must return, in order:
+  #
+  # * an RDBI::Cursor object which encapsulates the result
+  # * an RDBI::Schema struct which represents the kinds of data being queried
+  # * a +type_hash+ for on-fetch conversion which corresponds to the
+  #   RDBI::Column information (see RDBI::Schema) and follows a structure similar
+  #   to RDBI::Type::Out
+  #
+  # These return values are passed (along with this object and the binds passed
+  # to this call) to RDBI::Result.new.
+  #
+  inline(:new_execution) do |*args|
+    raise NoMethodError, "this method is not implemented in this driver"
+  end
+end
+# vim: syntax=ruby ts=2 et sw=2 sts=2
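
The driver contract described above (override `new_execution`, return three values, refuse to execute after `finish`) can be sketched in isolation. This is a minimal sketch, assuming stand-in classes `SketchStatement` and `EchoStatement` (both hypothetical, not part of rdbi) and a plain Hash in place of RDBI::Result. It mirrors only the lifecycle visible in the diff, with type conversion and handle tracking left out.

```ruby
# Stand-in base class mirroring only the lifecycle shown in the diff;
# it is NOT the real RDBI::Statement.
class SketchStatement
  attr_reader :query, :last_result

  def initialize(query)
    @query    = query
    @mutex    = Mutex.new
    @finished = false
  end

  def finished?
    @finished
  end

  def execute(*binds)
    raise StandardError, "you may not execute a finished handle" if @finished
    @mutex.synchronize do
      cursor, schema, type_hash = new_execution(*binds)
      # A Hash stands in for RDBI::Result in this sketch.
      @last_result = { :binds => binds, :cursor => cursor,
                       :schema => schema, :type_hash => type_hash }
    end
  end

  def finish
    @mutex.synchronize { @finished = true }
  end

  # Drivers are expected to override this.
  def new_execution(*binds)
    raise NoMethodError, "this method is not implemented in this driver"
  end
end

# Hypothetical driver statement: echoes its binds back as a single row.
class EchoStatement < SketchStatement
  def new_execution(*binds)
    cursor    = [binds]             # stand-in for an RDBI::Cursor
    schema    = { :columns => [] }  # stand-in for an RDBI::Schema
    type_hash = {}                  # stand-in for an output type map
    return cursor, schema, type_hash
  end
end

sth    = EchoStatement.new('select * from my_table where some_column = ?')
result = sth.execute("; drop table my_table;")
sth.finish

# Executing a finished handle raises, as in the diff.
error = begin
  sth.execute(1)
  nil
rescue StandardError => e
  e.message
end
```

Note that the bound value travels separately from the query text in this sketch: `sth.query` still contains the `?` placeholder, and the hostile string only ever appears in the bind list.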

data/lib/rdbi/types.rb ADDED

@@ -0,0 +1,200 @@
+require 'typelib'
+require 'typelib/canned'
+#
+# == RDBI::Type -- manage types going to and coming from your database.
+#
+# RDBI::Type consists of:
+#
+# * Checks and Conversions (facilitated by TypeLib) for ruby -> database and
+#   database -> ruby
+# * Mappings for Input (Ruby -> DB) and Output (DB -> Ruby) conversions based
+#   on type.
+# * Convenience methods for TypeLib and creating new mappings.
+#
+# == How does it all work?
+#
+# RDBI::Type leverages +TypeLib+ which is a filter chaining system, one which
+# you'll wish to read the documentation for to understand some of the concepts
+# here.
+#
+# === A conversion follows these steps:
+#
+# * Metadata on the type (more below) is located and used to reference a
+#   TypeLib::FilterList which contains the TypeLib::Filters (which in turn
+#   consist of a +Check+ and +Conversion+ proc) which will process your data.
+# * Data is passed to the FilterList and it is executed, running each filter
+#   in turn and applying any conversion that a passing check requests. This may
+#   very well mean that no checks pass and therefore your original data is returned.
+# * After processing, the data is yielded back to you for further processing
+#   (or a subsystem such as RDBI::Result#fetch and result drivers that take
+#   advantage of said data)
+#
+# === How is metadata located?
+#
+# It's important first to briefly describe how RDBI terms database I/O:
+#
+# * Binds going to the database proper are called 'input'.
+# * Data coming from the database is called 'output'.
+#
+# Mappings are keyed by type metadata, and the kind of key depends on the direction:
+#
+# * Input types are the native class of the type.
+# * Output types are a symbol that represents the database type. These type
+#   names are provided by RDBI::Column via RDBI::Schema in the response from an
+#   execution. See RDBI::Statement#new_execution and RDBI::Column#ruby_type.
+#
+# Note that in the latter case these database types are effectively normalized,
+# e.g., 'timestamp with time zone' in PostgreSQL is just +:timestamp+. You will
+# want to read the mappings in the source to get a full listing of what's
+# supported by default.
+#
+# Each map will also contain +:default+, which is what is used when a proper
+# lookup fails, as a fallback.
+#
+# === Ok, so how do I use these maps?
+#
+# RDBI::Type.create_type_hash is a helper to duplicate the default maps and
+# return them. If you don't wish to use the default maps at all, just a plain
+# old +Hash+ following the semantics above will work.
+#
+# To perform conversions, look at RDBI::Type::In::convert and
+# RDBI::Type::Out::convert.
+#
+module RDBI::Type
+  # A filter format to assist the conversions of DateTime objects.
+  DEFAULT_STRFTIME_FILTER = "%Y-%m-%d %H:%M:%S %z"
+  # Module for canned checks that are unique to RDBI. Includes the canned
+  # checks from TypeLib.
+  module Checks
+    include TypeLib::Canned::Checks
+    IS_NULL       = proc { |obj| obj.nil? }
+    IS_BIGDECIMAL = proc { |obj| obj.kind_of?(BigDecimal) }
+    IS_DATETIME   = proc { |obj| obj.kind_of?(DateTime) }
+    IS_BOOLEAN    = proc { |obj| obj.kind_of?(TrueClass) or obj.kind_of?(FalseClass) }
+    STR_IS_BOOLEAN = proc { |obj| obj.kind_of?(String) and obj =~ /^(t(rue)?|f(alse)?|1|0)$/i }
+  end
+  # Module for canned conversions that are unique to RDBI. Includes the canned
+  # conversions from TypeLib.
+  module Conversions
+    include TypeLib::Canned::Conversions
+    TO_NULL            = proc { |obj| nil }
+    TO_STRING_DECIMAL  = proc { |obj| obj.to_s('F') }
+    TO_STRING_DATETIME = proc { |obj| obj.strftime(DEFAULT_STRFTIME_FILTER) }
+    TO_STRING_BOOLEAN  = proc { |obj| obj ? 'TRUE' : 'FALSE' }
+    SQL_STR_TO_BOOLEAN = proc { |obj|
+      case obj
+        when /^(t(rue)?|1)$/i
+          true
+        when /^(f(alse)?|0)$/i
+          false
+      end
+    }
+  end
+  # Canned +TypeLib::Filter+ objects unique to RDBI to facilitate certain
+  # conversions. Includes TypeLib's canned filters.
+  module Filters
+    include TypeLib::Canned::Filters
+    NULL           = TypeLib::Filter.new(Checks::IS_NULL,       Conversions::TO_NULL)
+    FROM_INTEGER   = TypeLib::Filter.new(Checks::IS_INTEGER,    Conversions::TO_STRING)
+    FROM_NUMERIC   = TypeLib::Filter.new(Checks::IS_NUMERIC,    Conversions::TO_STRING)
+    FROM_DECIMAL   = TypeLib::Filter.new(Checks::IS_BIGDECIMAL, Conversions::TO_STRING_DECIMAL)
+    FROM_DATETIME  = TypeLib::Filter.new(Checks::IS_DATETIME,   Conversions::TO_STRING_DATETIME)
+    FROM_BOOLEAN   = TypeLib::Filter.new(Checks::IS_BOOLEAN,    Conversions::TO_STRING_BOOLEAN)
+    TO_BOOLEAN     = TypeLib::Filter.new(Checks::STR_IS_BOOLEAN, Conversions::SQL_STR_TO_BOOLEAN)
+  end
+  # Shorthand for creating a new +TypeLib::FilterList+.
+  def self.filterlist(*ary)
+    TypeLib::FilterList.new([Filters::NULL, *ary])
+  end
+  # Shorthand to duplicate the +DEFAULTS+ hash from a module. Most frequently
+  # used to get a copy of the RDBI::Type::In and RDBI::Type::Out type maps.
+  def self.create_type_hash(klass)
+    hash = { }
+    klass::DEFAULTS.each do |key, value|
+      flist = filterlist()
+      value.each do |filter|
+        flist << filter
+      end
+      hash[key] = flist
+    end
+    return hash
+  end
+  #
+  # The default output type map. As explained in RDBI::Type, these are keyed by
+  # a symbol naming the normalized database type, which is looked up via
+  # RDBI::Column#ruby_type to figure out which filter list to call.
+  #
+  module Out
+    DEFAULTS = {
+      :integer     => RDBI::Type.filterlist(Filters::STR_TO_INT),
+      :decimal     => RDBI::Type.filterlist(Filters::STR_TO_DEC),
+      :datetime    => RDBI::Type.filterlist(TypeLib::Canned.build_strptime_filter(DEFAULT_STRFTIME_FILTER)),
+      :timestamp   => RDBI::Type.filterlist(TypeLib::Canned.build_strptime_filter(DEFAULT_STRFTIME_FILTER)),
+      :boolean     => RDBI::Type.filterlist(Filters::TO_BOOLEAN),
+      :default     => RDBI::Type.filterlist()
+    }
+    #
+    # Perform a conversion. Accepts the object to convert, an RDBI::Column
+    # object, and a type map (a +Hash+).
+    #
+    def self.convert(obj, column, type_hash)
+      fl = type_hash[column.ruby_type]
+      unless fl
+        fl = type_hash[:default]
+      end
+      fl.execute(obj)
+    end
+  end
+  #
+  # The default input type map. As explained in RDBI::Type, these are keyed by
+  # the Ruby type with the exception of +:default+ which is a fallback
+  # conversion. RDBI::Statement subclasses will normally provide this object
+  # via +@input_type_map+ at construction time.
+  #
+  module In
+    DEFAULTS = {
+      Integer    => RDBI::Type.filterlist(Filters::FROM_INTEGER),
+      Fixnum     => RDBI::Type.filterlist(Filters::FROM_INTEGER),
+      Float      => RDBI::Type.filterlist(Filters::FROM_NUMERIC),
+      BigDecimal => RDBI::Type.filterlist(Filters::FROM_DECIMAL),
+      DateTime   => RDBI::Type.filterlist(Filters::FROM_DATETIME),
+      TrueClass  => RDBI::Type.filterlist(Filters::FROM_BOOLEAN),
+      FalseClass => RDBI::Type.filterlist(Filters::FROM_BOOLEAN),
+      :default   => RDBI::Type.filterlist()
+    }
+    #
+    # Perform a conversion. Accepts the object to convert and a type map (a
+    # +Hash+).
+    #
+    def self.convert(obj, type_hash)
+      fl = type_hash[obj.class]
+      unless fl
+        fl = type_hash[:default]
+      end
+      fl.execute(obj)
+    end
+  end
+end
+# vim: syntax=ruby ts=2 et sw=2 sts=2
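
The lookup-then-filter flow of the two `convert` methods can be sketched without TypeLib. This is a minimal sketch, assuming a stand-in `TypeSketch` module (hypothetical) in which a filter is just a check proc paired with a conversion proc. It also assumes that the first filter whose check passes wins, which may differ from how the real TypeLib::FilterList chains filters.

```ruby
# Stand-in for a TypeLib-style filter chain; NOT the real TypeLib API.
module TypeSketch
  Filter = Struct.new(:check, :conversion)

  NULL         = Filter.new(proc { |o| o.nil? }, proc { |o| nil })
  FROM_INTEGER = Filter.new(proc { |o| o.kind_of?(Integer) }, proc { |o| o.to_s })
  FROM_BOOLEAN = Filter.new(proc { |o| o == true || o == false },
                            proc { |o| o ? 'TRUE' : 'FALSE' })
  STR_TO_INT   = Filter.new(proc { |o| o.kind_of?(String) && o =~ /\A-?\d+\z/ },
                            proc { |o| Integer(o, 10) })
  TO_BOOLEAN   = Filter.new(proc { |o| o.kind_of?(String) && o =~ /\A(t(rue)?|f(alse)?|1|0)\z/i },
                            proc { |o| !!(o =~ /\A(t(rue)?|1)\z/i) })

  # Input maps are keyed by Ruby class, output maps by type symbol;
  # both carry a :default fallback, as in the diff.
  IN  = { Integer => [NULL, FROM_INTEGER], TrueClass => [NULL, FROM_BOOLEAN],
          FalseClass => [NULL, FROM_BOOLEAN], :default => [NULL] }
  OUT = { :integer => [NULL, STR_TO_INT], :boolean => [NULL, TO_BOOLEAN],
          :default => [NULL] }

  # Assumption: the first filter whose check passes wins; data that matches
  # no filter is returned unchanged.
  def self.run(filters, obj)
    filters.each do |f|
      return f.conversion.call(obj) if f.check.call(obj)
    end
    obj
  end

  # Ruby -> database: look up by the object's class.
  def self.convert_in(obj, type_hash)
    run(type_hash[obj.class] || type_hash[:default], obj)
  end

  # Database -> Ruby: look up by the column's ruby_type symbol.
  def self.convert_out(obj, ruby_type, type_hash)
    run(type_hash[ruby_type] || type_hash[:default], obj)
  end
end

bound = [42, true, nil, "text"].map { |v| TypeSketch.convert_in(v, TypeSketch::IN) }

fetched = [["42", :integer], ["f", :boolean], ["2010-01-01", :date]].map do |value, type|
  TypeSketch.convert_out(value, type, TypeSketch::OUT)
end

puts bound.inspect
puts fetched.inspect
```

Types with no entry in a map (`String` on input, `:date` on output here) fall through to `:default`, whose only filter handles `nil`, so the value comes back untouched.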