Chapter 4. Modules and Functions
The three basic building blocks of a Python program are modules, functions, and classes. This chapter will discuss modules and functions, while the next chapter will discuss classes. A Python module is a collection of statements that define variables, functions, and classes, and that is the primary unit of a Python program for the purposes of importing code. Importing a Python module actually executes the module. A function in Python is similar to functions or methods in most programming languages. Python offers a rich and flexible set of mechanisms for passing values to functions.
Modules
Python helps you to organize your programs by using modules. You can split your code among several modules, and the modules can be further organized into packages. Modules are the structural units of Python programs. In Java, this structural role is played directly by classes, and there is a strict correspondence between a file’s name and the class it contains. In Python, the filename is used only for organization, and does not require specific names to be used for any object within that file.
A Python module corresponds to a source code file containing a series of top-level statements, which are most often definitions of functions and classes. These statements are executed in sequence when the module is loaded. A module is loaded either by being passed as the main script when the interpreter is invoked or when the module is first imported by another module. This is the typical execution model for scripting languages, where you can take a recipe and easily try it out, but in Python the recipe can be more elegantly expressed using an object-oriented vocabulary.
At runtime, modules will appear as first-class namespace objects. A namespace is a dictionary-like mapping between identifiers and objects that is used for variable name lookup. Because modules are first-class objects, they can be bound to variable names, passed as arguments to functions, and returned as the result of a function.
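The first-class status of modules can be seen in a short sketch. It is written for a current Python 3 interpreter; the helper names pick_module and names_starting_with are invented for this illustration and are not part of any library.

```python
import math
import string

def names_starting_with(module, prefix):
    """Return the sorted attribute names of `module` that begin with `prefix`."""
    return sorted([name for name in dir(module) if name.startswith(prefix)])

def pick_module(want_math):
    """Return a module object chosen at runtime."""
    if want_math:
        return math
    return string

# A module returned from a function and bound to an ordinary variable name.
m = pick_module(True)
root = m.sqrt(16.0)                       # attribute lookup through the variable
sq_names = names_starting_with(m, "sq")   # the module passed as an argument
```

Nothing here is specific to math or string; any imported module object could be passed around in the same way.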
Modules are used as the global lexical scope for all statements in the module file—there is no single “global” scope in Python. The value of a variable binding referenced by a statement within a module is determined by looking in the module namespace, or within a local namespace defined within the module (by a function or class definition, for example). The contents of a module are first set up by the top-level statements, which create bindings between a name and a value. Name-binding statements include assignment, import, function definition, and class definition statements. So, although any kind of statement can appear at the top level in a module (there is no such thing as a statement that can appear only at the top level), name-binding statements play a pivotal role.
Unlike Java, all name bindings in Python take place at runtime,
and only as the direct consequence of a name-binding statement. This is
a straightforward model that is slightly different from the Java model,
especially because Python import statements have very different behavior
from their Java counterparts. (Jython does have a compilation phase, but
it is transparent to the user and no Python name bindings are set there.
Jython compiles Python source code to Java bytecode, which is then
dynamically loaded and creates the Python bindings at runtime.) A Java
import statement merely allows a specific class or classes to be used
with unqualified names, while a Python import statement actually executes the imported file and makes available all the names defined within it. A name binding in a module is available to other statements in the module after the binding is created. The dot operator, module.attribute, is used to access variable names bound in an imported module.
One consequence of the module and import semantics of Python is that Python does not force you to use object-oriented programming. You can mix procedural-style or functional programming in your modules when it makes sense to do so. Although it has been said that procedural programming does not make for very reusable code, this is true mostly for large programs written in statically typed programming languages. However, Python is a dynamically typed language, and because functions have the same first-class status as all other values, even straight procedural code can be reusable. With Python’s dynamically typed functions, you have the benefits of a generic programming paradigm such as C++ function templates, but with a simpler and more powerful model. In Python, it is easy to write common utility algorithms that take functions as arguments. Functions of this sort are sometimes called “pluggable,” and are very easy to reuse.
In the next sections, we will cover how to write function
definition statements and how to use functions and import statements for
Python modules and Java classes. Object-oriented class definition and
semantics will be discussed in Chapter 5. The other name-binding statements,
including the assignment statement and the for
statement, are covered in Assignment and Loops in Chapter 3.
Functions
The simplest form of a function definition is as follows:
def funid([arg,...]): block
At runtime this will create a function object and bind it to the name funid. When called, the function object will execute the block of statements, which can refer to the variables defined in the argument list. These will be bound to the values of the actual arguments as computed when the function is called. Functions are called in the usual way, through the ( ) call operator:
callable-object([expr,...])
The calling convention, pass-by-value, is the same as for objects in Java, where the value being passed is the object reference, not the underlying object (this convention is sometimes called pass-by-object-reference). This means that changes to the underlying object will be visible outside the function; however, changes to the variable binding (such as reassigning it within the function) will not be visible outside the function. As in Java, there is no direct support for pure call-by-reference.
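The difference between changing the underlying object and rebinding the argument name can be sketched as follows. The function names append_item and rebind are invented for this illustration.

```python
def append_item(items):
    # Mutates the list object that the caller's variable also refers to.
    items.append("added")

def rebind(items):
    # Rebinds only the local name; the caller's binding is untouched.
    items = ["replaced"]
    return items

shared = ["original"]
append_item(shared)            # shared is now ['original', 'added']
returned = rebind(shared)      # shared is still ['original', 'added']
```

After both calls, shared holds the appended item but has not been replaced, which is the behavior the paragraph above describes.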
There is nothing magical about function objects in Python that enables them to be called using the call operator. As we will see in Special Methods in Chapter 5, any Python class can be defined to respond to the call operator, essentially allowing an instance of that class to mimic a function object.
Because Python is a dynamically typed language, there are no type
declarations for the arguments or the return value in a function
definition statement. Python does not support function overloading, as
Java or C++ do (but see Parameter Passing with Style
later in this chapter for the Python equivalent). A subsequent function definition for funid, like any other name-binding statement, will simply rebind funid without triggering an error. This is
true even if the later binding is just a variable assignment—you cannot
have functions and variables with the same name within a Python
namespace. On the other hand, a function in Python is fully generic, in
that any set of arguments can be passed to it. If the objects passed as
arguments do not support the operations performed inside the function, a
runtime error will be raised.
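Here is a small sketch of both points: a function accepting any arguments that support the operations it uses, and a function name being silently rebound. It is written for a current Python 3 interpreter, and the name double is hypothetical.

```python
def double(x):
    # Works for any object that supports the + operator with itself.
    return x + x

results = [double(2), double("ab"), double([1, 2])]

# An argument that does not support + fails only when the call runs.
failed = False
try:
    double(None)
except TypeError:
    failed = True

# A later binding of the same name replaces the function without complaint.
double = 42
```

After the last line, double no longer refers to a function at all, so a later call such as double(2) would itself raise a runtime error.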
All functions return a value, unless they are abandoned because of a raised exception. A return statement:
return [expr]
can be used to force the return of control to the caller and specify the return value as the one computed by expr. In the case of a bare return without an expression, or if the end of a function is reached without returning a value explicitly, the function will return the value None.
In Python, it is also possible to return multiple values from a function by building a tuple on the fly (for example, return head, tail). Then you can use unpacking assignment at the call site or work directly with the tuple.
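As a minimal sketch of this idiom, the hypothetical function split_first below returns two values as a tuple, and the call sites show both ways of consuming the result.

```python
def split_first(seq):
    """Return the first element and the remainder as a two-item tuple."""
    return seq[0], seq[1:]

# Unpacking assignment at the call site.
head, tail = split_first([1, 2, 3])

# Working directly with the returned tuple.
pair = split_first("abc")
```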
You also do not need to declare local variables in Python, but of course there are local variables. A variable is treated as local in a function (and more generally as local in any scope) if it is bound for the first time through any of the name-binding statements within the function or scope. Arguments are also local, and they are implicitly bound at call time. You will find more information on scoping rules in the Scoping Rules section, later in this chapter.
Tip
CPython 2.2 introduces a special kind of function called a generator. A generator is defined the same way as a regular function, but instead of a return statement, it uses the new keyword yield in a statement of the form yield expr. When a generator is called, it returns a generator object. The generator object supports the iterator protocol. When that iterator’s next method is invoked, the generator body executes until it encounters a yield statement, at which time it returns the value of the expression in that statement. When the generator object is invoked again, it continues execution from the point of the yield, having saved the state of all its local variables. The generator runs until it encounters a yield again, or until it exits normally. Calls to generator functions can be placed anywhere iterators can, including the righthand side of a for statement. The generator object is invoked repeatedly until it either raises a StopIteration exception or exits normally.
For example, the following simple generator returns an increasing range of integers one at a time:
def generateInts(N):
    for i in range(N):
        yield i
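The following sketch shows one way such a generator might be driven by hand. It assumes a current Python 3 interpreter, where the built-in next( ) function is used to advance an iterator; the name generate_ints is a restyled copy of the function above.

```python
def generate_ints(n):
    for i in range(n):
        yield i

gen = generate_ints(3)
first = next(gen)            # runs the body up to the first yield
rest = [i for i in gen]      # the for clause keeps resuming the generator

# Once the body has exited, further requests signal exhaustion.
exhausted = False
try:
    next(gen)
except StopIteration:
    exhausted = True
```

In ordinary code you rarely call next( ) yourself; a for statement does the same work and handles the end of iteration for you.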
Parameter Passing with Style
Python supports many useful features related to parameter passing, through fancier argument specifiers. All these features are absent from Java.
First, you can specify a default value for an argument, which makes the argument optional. Just add the value after the arg in the argument list like so: arg=expr (by stylistic convention, you do not put spaces around the equals sign in the argument list). The expression is evaluated once and only once when the function definition statement is executed. It is not re-executed every time the function is called. If the expression value is a mutable object (e.g., a list), it is shared by all the function invocations. Therefore, you need to be careful, because any changes you make to the object in place (such as using append( )) are then visible to future function calls. Here is some code that shows a default argument in action.
a = 2
def func(x=a):
    print x
func( )
func(1)
a = 3
func( )
2
1
2
In this example, the rebinding of a = 3
does
not affect the binding in x=a
in the function
statement, because the x=a
binding was executed first
and is not re-executed on each function call. A typical idiom that you
can use to cope with problems that can arise when trying to use an
optional, mutable list argument is to create a new copy of the argument
each time:
def manip(..., l=None):
    if l is None:
        l = [1,2,3]
    ...
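To make the pitfall concrete, here is a small sketch contrasting a shared mutable default with the None idiom. The function names append_shared and append_fresh are invented for this illustration.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the def statement runs.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # A new list is created on each call that omits the argument.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)    # the same list object as first, now [1, 2]

fresh_one = append_fresh(1)  # [1]
fresh_two = append_fresh(2)  # [2], independent of fresh_one
```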
A default value can depend only on variable names that are valid in the namespace when the function is defined—it cannot depend on the other arguments to the function. If you make an argument optional, all subsequent arguments in the function must also have default values.
Python has a richer syntax than Java for passing arguments when a function is called. At call time, you can include any argument in the call using the same arg=expr style. For example, the function func in the previous example could be called as func(x=3). This is called a keyword argument. Keyword arguments can be in any order (but must come after the standard arguments), and their expressions are passed to the appropriate argument.
The keyword argument syntax works for all user-defined functions, but unfortunately does not work for many of the Python built-in functions. You cannot use keyword arguments when calling a Java method from Jython, since ordinary Java compilation causes the loss of variable name information (there is a partial exception for constructors; see Using Beans in Jython in Chapter 8). If you have experience with calls to heavily overloaded Java methods with many arguments, you can see that appropriate use of defaults and keyword arguments can increase code readability and clarity for your Python code.
If you end a function’s argument list with a name that has a * in front of it, such as *rest, rest captures in a tuple any excess arguments passed to the function. By using this syntax, a function can take a variable number of arguments. You can also have a second catch-all name at the end of your list, with ** in front. The double-star argument captures in a dictionary any keyword arguments that are not already specified in the argument list. If both of these argument types exist in the function, the tuple argument must come first.
Here is a summary of the complete function definition syntax:
def funid([arg,...[, *rest[, **kwargs]]]): block
The following function will be used to make the syntax clearer:
def func(a, b=0, c="fred", *d, **e):
    print a, b, c, d, e
Ordinary arguments are bound left to right and defaults are filled in. Notice that the catch-all arguments are empty:
func(1, 2, 3)
func(1, 2)
func(1)
1 2 3 ( ) {}
1 2 fred ( ) {}
1 0 fred ( ) {}
Keyword arguments are explicitly bound to the named argument after
the ordinary arguments are bound from left to right. At the end of the
call, all arguments (except the catch-alls) must have either a default,
an ordinary argument, or a keyword argument. In the preceding code, the
argument a
must be bound either with a keyword
argument, or by having the call start with an ordinary argument:
func(1, c=3)
func(1, 2, c=3)
func(b=2, a=1)
1 0 3 ( ) {}
1 2 3 ( ) {}
1 2 fred ( ) {}
Finally, assuming that all the listed arguments are filled, the catch-all arguments grab any extras:
func(1, 2, 3, 4, 5, 6)
func(1, 2, c=12, f="hi", g="there")
func(1, 2, 3, 4, g="there")
1 2 3 (4, 5, 6) {}
1 2 12 ( ) {'g': 'there', 'f': 'hi'}
1 2 3 (4,) {'g': 'there'}
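If you want to experiment with these binding rules on a current Python 3 interpreter, a variant of the same function that returns its arguments instead of printing them is convenient. This is only a sketch; returning a tuple is our adaptation, not part of the original listing.

```python
def func(a, b=0, c="fred", *d, **e):
    # Return the bindings so they can be inspected or compared.
    return a, b, c, d, e

defaults_used = func(1)                       # b and c take their defaults
keywords_used = func(b=2, a=1)                # keyword order does not matter
extras_caught = func(1, 2, 3, 4, g="there")   # d and e collect the extras

# Leaving the required argument a unbound is an error at call time.
missing_a = False
try:
    func(b=2)
except TypeError:
    missing_a = True
```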
Scoping Rules
Function definitions can be nested. The block of statements in a function definition (or class definition) is placed in a local scope in the same way that top-level statements are placed in the global scope. Ordinary control-flow statements and list comprehensions do not introduce new scopes.
From the beginning of time until Version 2.1, Python variable names were resolved this way:
If the name is bound by some name-binding statement in the current scope, all usage of that name in the scope refers to the binding in the local scope. This is enforced during the transparent compilation phase. If such a name is used at runtime before it is actually bound, this produces an error.
Otherwise, a possible binding of the name in the global scope is checked (at runtime) and if such a binding exists, this is used.
If there is no global binding, Python looks in the built-in namespace (which corresponds to the built-in module __builtin__). If this also fails, an error is issued.
Under these rules, there are only three scopes: local, global, and __builtin__. Scopes do not nest. Therefore, a
function whose definition is nested inside another function cannot
refer to itself recursively because its name is bound in the enclosing
namespace, which is in neither the inner function’s scope nor the
global or built-in scopes. In addition, names in the enclosing
namespace, but not in the global namespace, also cannot be used. The
following code shows the potential for “gotchas.”
def outerFunc(x, y):
    def innerFunc(z):
        if z > 0:
            print z, y
            innerFunc(z - 1)
    innerFunc(x)
outerFunc(3, "fred")
This code has two name errors that a Java programmer may not be expecting. The use of y in the print statement is a name error because y is only defined in the outerFunc scope, not in the local or global scope. The usual workaround in this case is to change the definition of innerFunc to def innerFunc(z, y=y):, which works but is undeniably awkward. Even with that workaround, the next line containing the call to innerFunc is also a name error for the same reason. In practice, this is not much of an issue (unless you use lambda expressions a lot, there is rarely a reason why functions need to be nested in Python).
With Version 2.2 of CPython, however, new rules will replace the old ones, allowing access from one scope to bindings in the enclosing scopes in the way that a Java programmer would expect. The new rules can already be activated in Jython 2.1 and CPython 2.1 on a per-module basis, by putting the following __future__ statement before any other statement in the module:[7]
from __future__ import nested_scopes
With the new rules, if a name is locally bound in a scope by some statement in that scope, then every use in the scope refers to this binding. If not, the name is called free.
A free name, when used, refers to the binding in the nearest enclosing scope that contains a binding for that name, ignoring scopes introduced by class definition. If no such explicit binding exists at compile time, Python tries to resolve the name at runtime, first in the global scope and then in the built-in namespace.
The practical meaning of the new rule is that when it is
created, an inner function gets a frozen copy of any referenced outer
bindings (identifiers only—the values are not copied) as they exist at
that time. With these rules, it is still impossible for the inner
scope code to modify those outer bindings; using a name-binding
statement simply creates a new local binding. In this way, Python
diverges from most other languages with nested scopes. Under the new
rules, the function outerFunc
will work
perfectly.
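On a current Python 3 interpreter, where nested scopes are always in effect, the earlier example can be tried in the following adapted form. This is a sketch: collecting the output in a list named lines is our change, made so that the result can be inspected.

```python
def outer_func(x, y):
    lines = []
    def inner_func(z):
        if z > 0:
            # y and lines are free names, resolved in the enclosing scope.
            lines.append("%s %s" % (z, y))
            # inner_func itself is also found in the enclosing scope.
            inner_func(z - 1)
    inner_func(x)
    return lines

result = outer_func(3, "fred")   # ['3 fred', '2 fred', '1 fred']
```

Note that the inner function only reads the outer names and mutates the list in place; it never rebinds an outer name.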
Under both sets of rules, you can always force a name to refer to the global/built-in binding, even if it occurs in a more local name-binding statement, by using the global declaration statement:
global name[,...]
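A short sketch of the effect of the global statement follows. The names counter, bump_local, and bump_global are invented for this illustration.

```python
counter = 0

def bump_local():
    # Without a global declaration, assignment creates a local binding.
    counter = 100
    return counter

def bump_global():
    # The declaration makes every use of counter refer to the module binding.
    global counter
    counter = counter + 1
    return counter

local_result = bump_local()
after_local = counter        # still 0: the module binding was not touched
bump_global()
after_global = counter       # 1: the module binding was rebound
```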
Flying First Class
We have already mentioned more than once that Python functions are first-class objects. They can be stored in variables, returned from other functions, and passed around the same as any other object. The function definition statement simply creates such an object and binds it to a name.
It is also possible to create a function object without binding
it to a name using the lambda
operator, which was
briefly introduced in Functional Programming in
Chapter 2. The syntax of a
lambda
is a little different from the
def
statement:
lambda args: expr
The argument list args
has the same
structure as the argument specifier of a def
statement. Because lambda
is an operator, not a
statement, it can appear inside an expression and produces an
anonymous function object. This function can be called like any other
function. When called, it evaluates the expression
expr
using the actual values of the
arguments and returns the obtained result. Therefore, the following
bits of code are equivalent:
fun_holder = lambda args: expr
def fun_holder(args): return expr
And both versions are called using the syntax:
fun_holder(args)
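As a concrete instance of this equivalence, the two hypothetical definitions below behave the same when called, including their handling of a default argument.

```python
# An anonymous function object, bound to a name by ordinary assignment.
add_lambda = lambda a, b=1: a + b

# The equivalent def statement.
def add_def(a, b=1): return a + b

lambda_results = [add_lambda(2), add_lambda(2, 5), add_lambda(a=2, b=3)]
def_results = [add_def(2), add_def(2, 5), add_def(a=2, b=3)]
```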
We will use functions as first-class objects often throughout this book. However, the full implications for program design of having first-class functions are beyond our scope. We’ll limit ourselves to some more reference material and two examples.
Python offers a built-in function that enables you to dynamically call a function object (or any callable object) with a computed set of arguments. This is useful in the case where you do not even know exactly how many or which arguments will be used at compile time. It is also useful at times when your arguments have been calculated in a sequence or dictionary, but the function being called expects the arguments separately. The syntax is:
apply(function[, args[, kwargs]])
where args
should be a sequence
specifying the positional arguments, and
kwargs
is a dictionary for the keyword
arguments. Both are optional. For example, use the same function we
used before:
def func(a, b=0, c="fred", *d, **e):
    print a, b, c, d, e
samefunc = func
samefunc(1, 2, 3)
t = (1, 2, 3)
kw1 = { "g": "hi" }
apply(samefunc, t) #equivalent to samefunc(1, 2, 3)
apply(samefunc, t[:2], kw1) #equivalent to samefunc(1, 2, g="hi")
apply(samefunc, (1,), {"b": "hi"}) #equivalent to samefunc(1, b="hi")
1 2 3 ( ) {}
1 2 3 ( ) {}
1 2 fred ( ) {'g': 'hi'}
1 hi fred ( ) {}
The sequence argument to apply is treated as though the arguments were listed one at a time and left to right. The dictionary argument to apply then works as though each key/value pair in the dictionary were passed as a keyword argument.
In Jython 2.0 and later (CPython introduced this in 1.6), you can also pass sequences or dictionaries in the same “exploded” manner that apply uses by placing * for sequences or ** for dictionaries before the argument at call time. The arguments are applied to the function exactly as they would be in apply. So, the apply calls in the preceding code could also be written as:
samefunc(*t)
samefunc(*t[:2], **kw1)
samefunc(*(1,), **{"b": "hi"})
and would return the same results. This is purely syntactic sugar, but can sometimes be easier to read.
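The star syntax is the form to reach for on a current Python 3 interpreter, where the apply built-in is no longer available. The sketch below repeats the three calls with a version of func that returns its arguments rather than printing them, which is our adaptation.

```python
def func(a, b=0, c="fred", *d, **e):
    return a, b, c, d, e

t = (1, 2, 3)
kw1 = {"g": "hi"}

all_positional = func(*t)                    # same as func(1, 2, 3)
mixed = func(*t[:2], **kw1)                  # same as func(1, 2, g="hi")
keyword_b = func(*(1,), **{"b": "hi"})       # same as func(1, b="hi")
```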
There is also a built-in module operator that defines corresponding functions for all Python operators, such as operator.add for the + operator. By combining apply and operator, you can mimic any set of static actions at runtime using dynamic functions or operators. The typical use of this module is to minimize the need to use lambda expressions when using the reduce( ) function.
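A brief sketch of that typical use follows. It assumes a current Python 3 interpreter, where reduce is imported from the functools module rather than being a built-in.

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4]

# operator.add stands in for the expression lambda x, y: x + y.
total = reduce(operator.add, numbers)
same_total = reduce(lambda x, y: x + y, numbers)

# operator.mul with an explicit starting value of 1.
product = reduce(operator.mul, numbers, 1)
```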
The following example shows how to construct a set of function objects dynamically for building some HTML text (in a rather simple-minded way) through the use of a helper function. The example shows how you might use nested function definitions, nested scopes, and first-class functions.
When the outer function is called, it constructs and returns a new function object. Each constructed function deals with a given tag, passed as a parameter to the helper function, and wraps its input between opening and closing versions of the tag, as defined in the outer function. Each constructed tag function can take as input any number of text fragments that will be concatenated and can set arbitrary attributes for the tag, through keyword arguments.
from __future__ import nested_scopes
def tag_fun_maker(tag):
    open = "<%s" % tag
    close = "</%s>" % tag
    def tagfunc(*content, **attrs):
        attrs = ['%s="%s"' % (key, value) for key, value in attrs.items( )]
        attrs = ' '.join(attrs)
        return "%s %s>%s%s" % (open,attrs,''.join(content),close)
    return tagfunc
html = tag_fun_maker("html")
body = tag_fun_maker("body")
strong = tag_fun_maker("strong")
anchor = tag_fun_maker("a")
print html(body(
    "\n",
    anchor(strong("Hello World from Jython!"), href="http://www.jython.org"),
    "\n"))
<html ><body >
<a href="http://www.jython.org"><strong >Hello World from Jython!</strong></a>
</body></html>
The main point of this example is that
tag_fun_maker
is able to create a function
dynamically and return it. The example also parameterizes the created
functions using the new nested scope rules, which cause the inner
function to act as a lexical closure, able to refer to variables in
the outer scopes as they existed when the function was created.
Here is an idiom borrowed from Smalltalk that you might call “do around.” Sometimes, you have a resource that needs to be opened, used, and closed in a variety of places, which might cause you to repeat the open and close logic frequently. A typical example is a file.
def fileLinesDo(fileName, lineFunction):
    file = open(fileName, 'r')
    result = [lineFunction(each) for each in file.readlines( )]
    file.close( )
    return result
The key here is that you don’t have to rewrite the open and close statements each time you use the file. This is a trivial point for files, but if we added error checking or had a more complicated resource, it would be very useful.
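One way the error checking mentioned above might be added is with a try/finally block, so that the file is closed even if the line function raises an exception. This is a sketch for a current Python 3 interpreter, not the only possible design; the name file_lines_do and the temporary file used to exercise it are our own.

```python
import os
import tempfile

def file_lines_do(file_name, line_function):
    f = open(file_name, "r")
    try:
        return [line_function(each) for each in f.readlines()]
    finally:
        # Runs whether or not line_function raised an exception.
        f.close()

# Exercise the helper against a throwaway file.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("alpha\nbeta\n")
try:
    lengths = file_lines_do(path, lambda line: len(line.rstrip("\n")))
finally:
    os.remove(path)
```

The caller still supplies only the per-line behavior; the open and close logic, now with its cleanup guarantee, lives in one place.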
Import Statements and Packages
In Python, modules are either built-in (e.g., the
sys
module) or defined by external files, typically
source files with a .py extension. Python comes
with a set of external modules that form the Python standard library,
which is mostly shared between CPython and Jython. Most CPython modules written in Python work directly in Jython. The CPython extension modules that are written in C rather than Python cannot be used from Jython, although some of these modules have been ported from C to Java.
External modules are retrieved from a path, which, like Java’s classpath, is a set of directories. In Python, the path is stored as a list of strings in the sys.path attribute of the sys module (e.g., ['', 'C:\\.', 'e:\\jython\\Lib', 'e:\\jython']), which by default should point at the current working directory and at the Python standard library directory. The path can be changed for all your Jython modules by editing the Jython registry (see Appendix B). Also, the sys.path variable can be changed dynamically by Python code.
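Changing the path from code amounts to ordinary list manipulation, as the following sketch suggests. The directory name is hypothetical and does not need to exist for the example to run.

```python
import sys

extra_dir = "/tmp/hypothetical_modules"   # hypothetical directory name

original_path = list(sys.path)            # keep a copy so we can compare later
sys.path.append(extra_dir)                # later imports would also search here
was_added = extra_dir in sys.path

sys.path.remove(extra_dir)                # undo the change
```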
Python modules can be organized in packages using the directory structure in a manner similar to Java. But a subdirectory of one of the directories in sys.path is considered a package only if it contains an __init__.py Python source file and if all its parent directories do as well, up to but excluding the one in sys.path (there is no concrete default/root package in Python). The names of the packages (directories) in the chain down to the bottommost package (subdirectory) are separated by dots to form the qualified name of the package (e.g., foo1.foo2.foo4). A module gets a qualified name by appending its name (the filename without the .py extension) to the qualified name of its parent package (a top-level module has no parent package).
Jython 2.1 will also allow .zip and .jar archive files to be placed on the Python path. The exact specification is still not complete as of this writing (and may change due to a proposal to put similar functionality in CPython 2.2). However, the basic idea is that the files and directories compressed in the archive would be treated exactly as though they were an uncompressed part of the filesystem.
At runtime, first-class module objects are created for modules and also for packages, in the latter case by loading their __init__.py modules. The __init__.py file can contain statements that initialize the package; however, it is often completely empty and serves just as a marker. This is different from Java, in which packages do not have first-class status.
When loading a module, Python ensures that all its parent packages
are loaded as well. Loading for a module happens once, and the loaded
module and package objects are cached in a dictionary stored in
sys.modules
with their qualified names as
keys.
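The cache can be inspected directly. The sketch below assumes a current Python 3 interpreter and uses the standard library package json and its submodule json.decoder purely as convenient examples.

```python
import sys
import json.decoder

# Both the parent package and the module are cached under qualified names.
package_cached = "json" in sys.modules
module_cached = "json.decoder" in sys.modules

# A second import does not reload; it returns the cached object.
first = sys.modules["json.decoder"]
import json.decoder as again
same_object = again is first
```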
When loaded, a package/module is bound with its name in its parent package’s namespace (if there is a parent package). So, at runtime packages and modules are retrieved through normal attribute lookup. In Python, unlike Java, there is no name resolution against packages or of qualified names at compilation time. Name resolution takes place only at runtime. Requests for modules and packages to be loaded can be issued through import statements, which are name-binding statements executed at runtime.
When a module is loaded, and before the execution of its top-level
code, the identifier __name__
is bound to the
qualified name of the module in the module’s global namespace. This
allows you to access the name by which the module is called by the
outside world. The case of a module passed as main script by directly invoking the interpreter is special; in that case, __name__ is always set to '__main__', and not to the actual name of the module.
The following idiom is typical:
if __name__ == '__main__':
    ... # code
and is often used to put some test code in a library module, or to give it a working mode as a standalone utility. This idiom is roughly analogous to the special main() method in Java.
Import Statements
Import statements always ensure that their targets are compiled and loaded along with all the packages in the qualified name of the target. Moreover, except in the case of the main module called by invoking the interpreter, Jython caches the results of the transparent compilation that takes place when a module foo.py is loaded in the class file foo$py.class.
Python offers two different import statements: import and from. They differ in how they place the imported module within the calling module’s namespace.
The syntax of import is:
import qualmod [as name][, ...]
If qualmod specifies a bare module, this module is bound to its name in the current scope. Otherwise, the top package specified by qualmod is bound to its top-level name. For example, import foo.bar ensures that both the package foo and the module foo.bar are loaded and binds the name foo to the foo module object in the current scope. Then bar and its contents are accessible through attribute lookup. With as name, the target module (not the top package) is bound to name, and the top-level package is not bound at all.
The syntax of from is:
from qualmod import name1 [as alias1][, name2 ...]
The value of qualmod.name1 is bound to name1 in the current scope, or to alias1 if that is specified. The qualmod itself is not bound in the current scope. These semantics mean that any later rebinding of qualmod.name1 (for example, by reloading qualmod) will not affect name1 in the importing scope. By using the from statement, the names imported are accessible directly in the calling module without using the dot operator.
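The consequence of this separate binding can be sketched without creating any files by registering a throwaway module object in sys.modules. Everything named demo_settings here is hypothetical, and the technique of building a module with types.ModuleType is used only to keep the example self-contained on a current Python 3 interpreter.

```python
import sys
import types

# Build a stand-in module and register it so that import can find it.
demo = types.ModuleType("demo_settings")
demo.level = 1
sys.modules["demo_settings"] = demo

import demo_settings
from demo_settings import level

# Rebind the name inside the module after both imports have run.
demo_settings.level = 2

seen_via_module = demo_settings.level   # attribute lookup sees the new value
seen_via_from = level                   # the from binding keeps the old value

del sys.modules["demo_settings"]        # clean up the stand-in
```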
There is a special form of from:
from qualmod import *
This statement takes all the bindings in qualmod whose names do not start with '_', and binds them to the very same name in the current scope. If qualmod.__all__ exists, it should be a list of strings, and then only the names listed there will be considered and bound.
At first, from
statements may seem more
convenient than import
statements because the
variable names are accessible directly, and you don’t have to
continually type the module name. However, you do need to be careful
when using from
. It is possible that a
from ... import *
statement could rebind things
that you don’t want rebound, such as the names of built-in functions.
This is especially true if the from
statement does
not occur at the top of the module. Starting with Python 2.1, a
correct __all__
attribute has been added to most
modules in the standard library, but you should still pay attention to
this problem with your own modules. In Python 2.2, or in Python 2.1
with nested scopes enabled, from ... import *
statements are allowed only at the top level of a module because the
result of the import is ambiguous in a nested scope if bindings from
the imported module shadow an existing reference.
Also, because from
statements create new name
bindings separate from the module name bindings, you cannot see
changes made in the imported module when it is reloaded. Although this
is not usually a problem in running code, it can be a significant
annoyance if you are developing using an interactive session—you will
continually find that modules are not seeing other module
changes.
Both import and from work by calling a built-in special function __import__, which looks like this:
__import__(moduleName[, globals[, locals[, fromlist]]])
Internally, Python converts import and from statements to an __import__ call. The arguments are a string for the fully qualified name of the module to be imported, dictionaries of current global and local namespace bindings, and the list of items to import, if the statement is a from.[8] If the call is from ... import *, the last argument is ['*']. Then the function imports the module represented by moduleName and returns the top-level package if fromlist is empty, and the bottom-level package if it is not. The actual binding of names is left to Python and is not performed by the function.
You can call this function directly in your programs to dynamically control module import, for example, to load modules one at a time into a test suite. You can even substitute this function with your own custom import function by rebinding __builtin__.__import__. In Jython, this can also be used to import Java classes and packages.
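As a rough sketch of such a substitution, the following installs a wrapper that records each requested module name and then delegates to the original function. The names logging_import and seen are hypothetical, and the try block is an assumption added so that the sketch also runs on Python 3, where the __builtin__ module was renamed builtins.

```python
try:
    import __builtin__ as builtins  # Python 2.x and Jython
except ImportError:
    import builtins                 # Python 3 renamed the module

original_import = builtins.__import__
seen = []  # module names requested while the hook is installed


def logging_import(name, *args, **kwargs):
    """Hypothetical hook: record the request, then delegate."""
    seen.append(name)
    return original_import(name, *args, **kwargs)


builtins.__import__ = logging_import
try:
    import string                # routed through logging_import
    from xml.dom import minidom  # so is this from statement
finally:
    builtins.__import__ = original_import  # always restore the original

print(seen)
```

The list may also contain names requested indirectly by the imported modules themselves, because every import statement goes through the same hook while it is installed.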
Importing Java Classes
Jython also allows access to Java classes and packages through import statements. Jython is able to load classes through the underlying Java Virtual Machine (JVM), both from the Java classpath and from the directories in sys.path. Conceptually, for the purpose of loading Java classes, you can think of the directories in sys.path as having been appended to the classpath. This means there is no need to mark Java packages on sys.path with __init__.py modules; in fact, doing so would turn them into Python packages.
Python packages and modules take precedence over Java packages. On the other hand, Java classes in a shadowed Java package can still be loaded from the shadowing Python module.
Because it is not possible to ask the JVM directly, through a Java API, which Java packages can potentially be loaded, Jython scans the available .jar files at startup and scans the candidate directories for Java packages and classes at runtime. The information obtained from .jar files is cached between sessions. All this is done to properly interpret import statements that can trigger loading of Java classes or packages. You’ll notice that if you add a .jar file to your classpath, then the next time you start Jython, you’ll see a message that the new file has been identified.
In Java, import statements are a matter of name resolution at
compilation time. We have already seen that the Python model is
different; import statements, even for packages, bind names to
concrete objects at runtime. To fit Java loading in this overall
model, Jython creates module-like, unique concrete objects (instances
of the internal org.python.core.PyJavaPackage
class) for Java packages.
A statement such as:
import java.lang
will bind java in the current scope to the module-like object for the Java package java. Moreover, this namespace object will map lang to the java.lang package. You can then access the String class by referring to it as java.lang.String. The Java feature of having the java.lang classes automatically imported does not work in Jython. However, the statement import java will give access to any class in the standard Java library, provided you fully qualify the name (such as java.util.List).
Java classes are wrapped by Jython in objects that mimic both Python-like class behavior and the original Java class behavior. A binding to this wrapper object will be set up by Jython in the module object for the package when import is requested.
All this has implications for the overhead of a statement such as from javapkg import *. This kind of statement creates wrappers for all the classes in javapkg and binds names to them both in the current scope and in the javapkg namespace. Technically, in this special case the wrappers do not load the Java classes immediately but lazily, as needed. Even so, it is not a good idea to use from javapkg import * liberally for Java packages in production code, just as one might avoid import javapkg.* in Java (which is also against the Sun coding guidelines for Java). The from javapkg import * statement can still be a useful feature, for example, when experimenting in an interactive session.
Auto-loading through lookup
In Java you can always refer to a class by using its fully qualified name without the need for a previous import statement. Given what we have explained about Python import statements, this is not true in Jython, but Jython does offer a shortcut.
If you need to refer to your.favorite.UsefulClass, you do not need an import your.favorite.UsefulClass first. You can simply import the top package your with import your, and then you can use attribute lookup to reach your.favorite.UsefulClass and also your.favorite.InvisibleClass, and so on. For example, you can get access to most of the Java classes by merely using the statement import java. Subpackages can then be accessed using the dot operator.
import java
x = java.util.Vector
print x
This syntax works because in Jython, attribute lookup on a Java package object triggers loading as necessary. Jython behaves the same way for the import of Python packages, but it should be noted that this feature is not offered by CPython, so it is not portable.
Reload
A nice feature of Python is its support for dynamically
reloading modules through the reload
built-in
function:
reload(mod)
The reload function takes a module mod and executes the top-level statements in its (possibly changed) corresponding file, reusing mod as global scope, after it has been emptied. The function then returns the possibly altered and reinitialized mod. Reload is most frequently used during development, when you might be continually testing a module as you change its source. To have the Python interactive interpreter recognize the changes, you need to reload the module. Reload can also be useful in the context of an application that needs to be dynamically reconfigured or upgraded at runtime.
It should be noted that reload does not operate recursively on the modules imported by mod, and that all the bindings to old values originally from mod in other modules remain in place. Code has access to the new values after reload only if it uses attribute lookup. Specifically, a module that imported some values from mod through from ... import * will keep the old, unaffected values. Also, instances of a class defined in mod will not be affected by the possibly changed definition. Jython also ships with the jreload module that offers some support for reloading Java classes.
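The behavior of existing instances can be illustrated with a small sketch. The module name shapes_demo, its Shape class, and the write_source helper are all hypothetical and exist only for this example; the try block again assumes a recent CPython, where reload is found in importlib.

```python
import os
import sys
import tempfile

try:
    from importlib import reload  # recent CPython releases
except ImportError:
    pass  # Python 2.x and Jython: reload is already a built-in

sys.dont_write_bytecode = True  # no cached bytecode, so reload recompiles the source

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "shapes_demo.py")


def write_source(text):
    """Hypothetical helper: (re)write the demo module's source file."""
    f = open(path, "w")
    f.write(text)
    f.close()


write_source("class Shape:\n    def sides(self):\n        return 3\n")
sys.path.insert(0, tmpdir)
import shapes_demo

old_shape = shapes_demo.Shape()  # instance of the original class

# Simulate an edit to the class definition, then reload.
write_source("class Shape:\n    def sides(self):\n        return 4  # edited\n")
same_module = reload(shapes_demo)
new_shape = shapes_demo.Shape()  # instance of the redefined class

print(same_module is shapes_demo)                # True: the module object is reused
print(old_shape.sides())                         # 3: still the old definition
print(new_shape.sides())                         # 4: the new definition
print(isinstance(old_shape, shapes_demo.Shape))  # False: a different class object
```

The old instance keeps a reference to the class object that existed before the reload, which is why it neither picks up the new method nor counts as an instance of the reloaded class.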
[7] Future statements such as from __future__ import feature[, ...] are both import statements and directives to the compiler to activate features that will become mandatory in a future release, as part of the Python strategy for gradually introducing new features. They should appear before any conventional statements.
[8] The dictionaries are ignored by the built-in version, except for the global bindings used to retrieve the importing package’s __name__. This name information is used to implement package-relative imports, a feature whose usage is discouraged. If you call the function directly and want to supply a fromlist, {} is a fine placeholder for both.