
Zorp Tutorial
Version 1.0.2
8th January, 2004

1. Introduction

This tutorial serves as an introduction to setting up a Zorp based firewall
on a GNU/Linux distribution. Zorp is a GPLd proxy firewall implementation
with the following features:

* deep protocol analysis (FTP, HTTP, SSL, telnet, finger, whois and plug
  proxies are included in the GPLd version)
* flexible decision engine, scriptable in Python
* true modularity, proxies can extend each other

It is assumed that you have general knowledge about IP networks, that you
know the differences between packet filtering and proxy firewalls and, last
but not least, that you know how to compile a kernel.

This tutorial covers Zorp 2.0, but the contents also apply to Zorp 1.4 and
will possibly also apply to later Zorp releases.

1.1. Further readings

You might also want to read the following HOWTO documents:

Networking Overview HOWTO

Net HOWTO

Firewall HOWTO

IPTables HOWTO

Linux 2.4 TPROXY patch documentation
(http://www.balabit.hu/en/downloads/tproxy/README.txt)

HOWTO documents are made available by the Linux Documentation Project at
http://www.linuxdoc.org/

2. Installing the operating system

Zorp currently requires a Linux based operating system, as transparent
proxying needs a couple of kernel extensions which are available for Linux
only. Once this support becomes available on other operating systems, Zorp
can easily be ported over.

So pick your favourite Linux distribution and install a bare-bones system
with only the absolutely necessary components you will need. Zorp is being
developed on Debian GNU/Linux so using Debian is the easiest, though
compiling it on other distributions is also possible.

If you have a Debian GNU/Linux system, you can use the following apt
line for the binary packages, and skip the compiling below:
deb http://www.balabit.hu/downloads/zorp/zorp-os/debian woody main zorp-gpl zorp-common 

You will need to compile a couple of programs and the kernel itself, but a
compiler is something that should not be installed on the firewall.
Therefore it is best to compile stuff on a separate host and copy only
the binaries to the firewall.

Be sure to install run-time dependencies so Zorp will find them later:

* glib 2.0 (Zorp 1.4 requires a specific glib version released as 1.3.1,
  Zorp 2 uses glib 2)
* python 2.1 or newer with extension class support; version 2.3 is recommended
  (Zorp 1.4 uses python 1.5.2 but should compile with python 2.1 too).
  If you have multiple versions of the python development environment, make
  sure that the default versions of python and python-extclass are the same!
* libcap 1.10 (Zorp optionally manages its own capabilities, dropping
  unneeded caps if possible)
* openssl 0.9.6g or later. At the time of writing this, 0.9.7c is
  the version which has no known security problems.

Zorp works with either a Linux 2.2 or Linux 2.4 kernel, but neither of
those is usually compiled as required in distributions. Thus you will need
to compile your own kernel, see the next section.

3. Compiling your own kernel

Zorp is a transparent proxy firewall, thus it needs a couple of kernel
extensions. Either Linux 2.2 or Linux 2.4 will work but the transparent
proxy features are somewhat different.

In addition to transparent proxying you might want to add a security patch
like openwall or grsecurity as the firewall is a security sensitive device.

3.1. Transparent proxying in Linux 2.2

Linux 2.2 features built in transparent proxy capabilities, but you need to
enable them in your kernel configuration. It can be found under the
"Networking options" menu, the option is called "Transparent proxy support".
This option requires that you turn on "IP: firewalling" option as well.

You might also want to enable policy routing and other advanced IP features
as they are often needed on firewalls.

3.2. Transparent proxying in Linux 2.4

The transparent proxy support that was present in Linux 2.2 was removed from
Linux 2.4 when iptables was introduced. We implemented a patch against Linux
2.4 that adds the required features so Zorp tightly integrates into
NetFilter/iptables.

You can download this patch from http://www.balabit.hu/en/downloads/tproxy/

After you add this patch, enable iptables, iptables connection tracking,
iptables nat and iptables transparent proxying options in your kernel
configuration. The target 'TPROXY' and the match 'tproxy' are especially
important; other iptables modules should be compiled as necessary.

Please note that as of now Zorp detects the presence of transparent proxy
support on a Linux 2.4 kernel by trying to load the iptable_tproxy module.
Thus it is required to compile the tproxy table as a module.

Some transparent proxy functionalities will work without this module,
but you need the kernel patch to have a fully functional Zorp.

4. Compiling Zorp

It is generally good not to have a compiler on your firewall host, so either
compile the package on a different host, or remove gcc and development files
from your firewall after installation.

In Zorp 2.0 the core zorp tarball was split into two: a library called
libzorpll containing the low level functions, and zorp itself.

You will first need to compile libzorpll:

tar xvfz zorplibll-2.0.14.2.tar.gz
cd zorplibll-2.0.14.2
./configure 
make
sudo make install    # assuming sudo is the command that makes you root

Make sure that you copy the resulting shared library to your firewall host.
This can be accomplished by using the DESTDIR make variable:

sudo make DESTDIR=/tmp/staging install

This command will use /tmp/staging as a root directory while copying files,
thus /usr/lib/libzorpll.so is copied to /tmp/staging/usr/lib/libzorpll.so.

At the end of the compilation you can simply copy the contents of your
staging directory to the firewall host.

Alternatively you can compile libzorpll into a Debian package by entering
"dpkg-buildpackage" in the extracted source directory. The build process
results in two Debian packages: libzorpll_<version>_i386.deb and
libzorpll-dev_<version>_i386.deb assuming you are compiling on an Intel
architecture. Install both debs on your compiling host, and libzorpll on
your firewall host, as development files are needed only for compilation.

If libzorpll was successfully compiled you can go on to compile Zorp itself:

tar xvfz zorp-2.0pre30.tar.gz
cd zorp-2.0pre30
./configure
make
sudo make DESTDIR=/tmp/staging install

This will compile zorp and copy the resulting binaries to /tmp/staging. The
configure script checks whether your system provides the required build
dependencies. If one of the dependencies is not met, try to install the
missing package. In addition to what is described in section 2 as Zorp
requirements, you will also need libzorpll.

It is possible that the configure script does not find some of the
required libraries even if they are installed. The most common problem
is the python development files. Zorp looks for the shared library version
of Python, which is not always provided by distributions. In this case you
might try to use the '--with-python-headers' and '--with-python-libraries'
configure options or ask for help on the mailing list.

Of course the trick for building Debian packages works here too: enter
"dpkg-buildpackage" in the extracted source directory. This creates the
following debs:

* zorp: the main program
* libzorp2: the library functions needed to run zorp and its modules
* libzorp2-dev: development files needed to compile zorp modules
* zorp-modules: proxy modules
* zorp-doc: documentation files

Of these, only zorp, libzorp2 and zorp-modules are required to be installed
on your firewall host.

5. Starting up Zorp

Assuming the build process was successful and you copied the necessary files
to your firewall host, you can now start configuring Zorp itself.

5.1. Sample network topology

In the following sections I will guide you through Zorp configuration
using a simple example. This example network has three distinct security
zones:

1) an intranet where protected client computers reside
   address range: 192.168.0.0/24
   firewall IP: 192.168.0.254
   
   * clients are permitted to use HTTP, HTTPS and FTP services destined to any
     other zones (internet, DMZ)
   * clients are permitted to use SMTP, DNS and NTP installed on the firewall

2) a demilitarized zone or DMZ on another interface where public access
   services are provided from (the web server of the company)
   address range: 10.0.0.0/24
   firewall IP: 10.0.0.254
   web server IP: 10.0.0.1
   
   * clients are not permitted to use any service outbound
   * clients are permitted to use SMTP, DNS and NTP installed on the firewall

3) the internet itself with a single, static IP address
   firewall IP: 11.12.13.14
   
   * clients are permitted to use HTTP services in the DMZ
   * clients are permitted to use SMTP installed on the firewall
   * the firewall must communicate with the NTP server on the internet and
     also post DNS requests to a single forwarder
   
5.2. Architecture

Zorp is a proxy based firewall which means that it has several protocol
implementations, each of which takes care of mediating a given protocol
between hosts on its different interfaces.

Zorp based firewalls are usually integrated into the network topology as
routers, this means that they have an IP address in all their subnets, and
hosts on different subnets use the firewall as their gateway to the outside
world.

Although proxy based, Zorp uses a packet filter to preprocess the packet
stream, and also to provide transparency.

A TCP session is established in the following way:

1) the client initiates a connection by sending a SYN packet destined to the
   server

2) the firewall behaves as a router between the client and the server,
   receives the SYN packet on one of its interfaces and consults the packet
   filter

3) the packet filter rulebase is checked whether the given packet is
   permitted

4) if the given connection is to be processed by a proxy, then the packet
   filter rulebase contains a REDIRECT (ipchains) or TPROXY (iptables)
   target. Both REDIRECT and TPROXY require a port parameter which specifies
   the local port of the firewall host where the proxy is listening.
   
   It is also perfectly possible although strongly discouraged to bypass the
   proxies and forward packets directly, you only need to use the ACCEPT
   target instead of TPROXY.

5) Zorp accepts the connection, checks its own access control rules and
   starts the appropriate proxy 

6) the proxy connects to the server on its own as needed (the server side
   connection is not necessarily established immediately)

7) the proxy mediates protocol requests and responses between the
   communicating hosts while analyzing the ongoing stream
   
Of course the remaining packets of the TCP session after the initial SYN
must also be allowed by the packet filter.

5.3. Configuring network interfaces

As I stated earlier, a Zorp based firewall fulfills the role of an IP router
from its neighbours' perspective. This means that all its interfaces must be
configured to have an IP address in the subnet of the connecting network.

The firewall has three interfaces:

* eth0 as the intranet interface with IP 192.168.0.254/24
* eth1 as the DMZ interface with IP 10.0.0.254/24
* eth2 as the internet interface with IP 11.12.13.14

NOTE: the transparent proxy patch for Linux 2.4 requires a local address
which does not collide with any local address in your network. The best
way to provide one is to configure a dummy0 interface with a dummy IP
address in the RFC1918 reserved range. You will need to pass this IP to Zorp
using the --autobind-ip command line option. See the TPROXY patch
documentation for more information.

(http://www.balabit.hu/en/downloads/tproxy/README.txt)

5.4. Configuring the packet filter

To configure the packet filter we first need to establish a couple of rules
we will be adhering to, as the packet filter ruleset can become quite
complicated. First of all we name all the neighbouring networks. These
names should be short and easy to remember, as they will be used when
naming chains.

Long name	| Short name
----------------------------
Intranet	| intra
Internet	| inter
DMZ		| dmz

The iptables subsystem defines several tables each with its own set of
chains and rules. We will be focusing on two tables now: the filter table
where simple packet filtering is done, and the tproxy table where we
are redirecting sessions to our proxies.

5.4.1. Storing the ruleset

Some people like storing their ruleset as a shell script which invokes the
necessary iptables commands. As I don't like mixing executable code and data
we use the format defined by iptables-save & iptables-restore.

As the raw iptables-restore format has no macro facility we created a
frontend named iptables-utils, a couple of scripts which help with the
creation and maintenance of packet filter rulesets. Here's an outline of
the iptables-utils approach:

  * the following files are used by iptables-utils:
    - iptables.conf.in: contains our ruleset before processing, this is a
      user supplied file, we are going to edit this with our favourite
      editor
    - iptables.conf.var: contains our macro definitions, it might contain
      a series of C like #define statements. I say C like because macro
      substitution differs from cpp.
    - iptables.conf.new: when processing conf.in & conf.var our new
      ruleset will be generated here
    - iptables.conf: is our current ruleset, iptables.conf.new is copied
      here if found to be correct
  * the ruleset is maintained the following way:
    - you edit either iptables.conf.in or iptables.conf.var
    - you process your modifications with the command 'iptables-gen', this
      generates iptables.conf.new
    - you test your new ruleset by invoking 'iptables-test', this script
      loads the new ruleset, waits a couple of seconds and reloads the
      old ruleset, so even if you made a mistake you are not locked out
      of the system
    - if the new ruleset is ok, you invoke 'iptables-commit' which
      overwrites iptables.conf with iptables.conf.new and loads the
      ruleset

Using iptables-utils has been very beneficial in the long term as the
number of lockouts decreased dramatically, which is good if you are
hundreds of miles away from the firewall.
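
The safety net behind 'iptables-test' can be sketched in a few lines of
Python. This is only an illustration of the load, wait, roll back idea
described above and not the actual iptables-utils script: the function name
'trial_load' and the 'load' and 'sleep' callables are made up for this
sketch, and a real 'load' would have to feed the ruleset to
iptables-restore.

```python
import time


def trial_load(load, new_ruleset, old_ruleset, wait_seconds=10, sleep=time.sleep):
    """Install new_ruleset, wait a while, then always restore old_ruleset.

    'load' is a caller-supplied callable (hypothetical in this sketch) that
    installs a ruleset. The rollback in the 'finally' branch runs even if
    the waiting period is interrupted.
    """
    load(new_ruleset)
    try:
        sleep(wait_seconds)
    finally:
        load(old_ruleset)


# Demonstration with a fake loader that only records what it was given.
loaded = []
trial_load(loaded.append, "iptables.conf.new", "iptables.conf", wait_seconds=0)
print(loaded)
```

The important design point is that the old ruleset comes back without any
action from the administrator, so a mistake in the new ruleset cannot lock
you out for longer than the waiting period.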

Macro expansion is not simple substitution. If a macro contains several
words, the rule where the macro is referenced is copied, so that in the end
you get a new rule for each word in your macro. For instance:

     iptables.conf.var:

         #define SSH_PERMITTED 1.2.3.4 1.2.3.5

     iptables.conf.in:

         -A INPUT -p tcp -m tcp -s SSH_PERMITTED --dport 22 -j ACCEPT

You will get two rules: the first with 1.2.3.4 substituted, the second
with 1.2.3.5 substituted.
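
The expansion described above can be imitated with a short Python sketch.
This is not the iptables-gen implementation: 'parse_defines' and
'expand_rule' are hypothetical names, the sketch assumes that macros are
matched as whole whitespace-separated words, and it assumes that a rule
referencing several multi-word macros expands to every combination.

```python
from itertools import product


def parse_defines(text):
    """Collect '#define NAME word word ...' lines into a dict of word lists."""
    macros = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "#define":
            macros[parts[1]] = parts[2:]
    return macros


def expand_rule(rule, macros):
    """Return one copy of the rule for each word of each referenced macro."""
    tokens = rule.split()
    # A token that is not a macro name has exactly one choice: itself.
    choices = [macros.get(token, [token]) for token in tokens]
    return [" ".join(combination) for combination in product(*choices)]


macros = parse_defines("#define SSH_PERMITTED 1.2.3.4 1.2.3.5")
for expanded in expand_rule(
        "-A INPUT -p tcp -m tcp -s SSH_PERMITTED --dport 22 -j ACCEPT", macros):
    print(expanded)
```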

iptables-utils is available from:

http://apt.balabit.com/zorp-gpl-os/pool/i/iptables-utils/

5.4.2. Naming the chains

In addition to the standard chains provided by iptables (INPUT, OUTPUT
etc) we will create separate chains for each security zone. Each security
zone will have two chains:

  * a chain which contains rules for traffic which passes the firewall
  * a chain which contains rules for traffic destined to the firewall

The first one will be prefixed by PR which stands for PRoxy rules, the
second one will be prefixed by LO which stands for LOcal rules. Proxy
rules will be placed into the 'tproxy' table, local rules will be placed
into the 'filter' table. If we assume that all traffic goes through
proxies we will need neither NAT nor mangle rules. (Of course we can add
further finetuning to our rulebase, like limiting the number of SYNs etc.)

5.4.3. Jumping to our chains

We have two sets of chains for each security zone: LOxxx chains are
processed in the filter table, INPUT chain; PRxxx chains are processed in
the tproxy table, PREROUTING/OUTPUT chain.

Our filter/INPUT chain will be something like this:

     ...
     -A INPUT -m tproxy -j ACCEPT
     -A INPUT -i <intranet iface> -j LOintra
     -A INPUT -i <internet iface> -j LOinter
     -A INPUT -i <dmz iface>      -j LOdmz
     -A INPUT -j DROP

This means that all permitted traffic must be enabled in its specific
chain or it will be dropped on the INPUT chain. Of course logging dropped
packets would be a good idea. It is important to mention that our
FORWARD chain should contain a single DROP rule as we don't forward
packets. Each LOxxx chain should look like this:

     -A LOintra -p tcp --dport 22 -j ACCEPT
     ... permit each service ... 
     -A LOintra -j DROP

Of course our LOxxx chains might be different for each zone, as we might
permit SSH access from the intranet only.

Note the '-m tproxy' rule at the front of other rules, it allows all
traffic redirected by any TPROXY feature to pass the filter table. (this
includes TPROXY redirections, and foreign-bound traffic)

We have taken care of the local services provided by the firewall, so let's
make our proxying rules now.

Our tproxy/PREROUTING chain will be something like this:

     -A PREROUTING -i <intranet iface> -d ! <fw intranet IP> -j PRintra
     -A PREROUTING -i <internet iface> -d ! <fw internet IP> -j PRinter
     -A PREROUTING -i <dmz iface>      -d ! <fw dmz IP>      -j PRdmz

A PRxxx chain should look something like this:

     -A PRintra -p tcp -d 0/0 --dport 80 -j TPROXY --on-port 50080
     ... repeat the above rule for each service ...

At the end of a PRxxx chain no DROP should be performed, as unmodified
sessions will be stopped when the filter table is evaluated. The port
number specified by TPROXY rules should match the port number where the
transparent proxy (Zorp in our example) will be bound.
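
Because a mismatch between a TPROXY port and the port a proxy is bound to
is easy to overlook, a small check can help. The following Python sketch
extracts every '--on-port' value from a ruleset and reports those without a
matching listener port. The function names are made up for this example,
and the list of listener ports has to be supplied by hand; nothing here
reads a real Zorp policy.

```python
import re

# Matches the TPROXY target together with its --on-port argument.
TPROXY_PORT = re.compile(r"-j\s+TPROXY\s+--on-port\s+(\d+)")


def tproxy_ports(ruleset_text):
    """Return the set of ports referenced by TPROXY rules in the ruleset."""
    ports = set()
    for line in ruleset_text.splitlines():
        match = TPROXY_PORT.search(line)
        if match:
            ports.add(int(match.group(1)))
    return ports


def unmatched_ports(ruleset_text, listener_ports):
    """Return TPROXY ports for which no listener port was given, sorted."""
    return sorted(tproxy_ports(ruleset_text) - set(listener_ports))


rules = """
-A PRintra -p tcp --dport 80 -j TPROXY --on-port 50080
-A PRintra -p tcp --dport 443 -j TPROXY --on-port 50443
"""
print(unmatched_ports(rules, [50080]))
```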

Here is a complete iptables configuration for our sample network:

iptables.conf.var:

#define IFintra   eth0
#define NETintra  192.168.0.0/24

#define IFinter   eth2

#define IFdmz     eth1
#define NETdmz    10.0.0.0/24

#define NTP_SERVERS 1.2.3.4 1.2.3.5

#define DNS_SERVERS 2.3.4.5

iptables.conf.in:

*tproxy
:PREROUTING ACCEPT
:OUTPUT ACCEPT
:PRintra -
:PRinter -
:PRdmz -
-A PREROUTING -i IFintra -j PRintra
-A PREROUTING -i IFinter -j PRinter
-A PREROUTING -i IFdmz   -j PRdmz
// PRintra chain
-A PRintra -p tcp --dport 80 -j TPROXY --on-port 50080
-A PRintra -p tcp --dport 443 -j TPROXY --on-port 50443
-A PRintra -p tcp --dport 21 -j TPROXY --on-port 50021
// PRinter chain
-A PRinter -p tcp --dport 80 -j TPROXY --on-port 50080
// PRdmz chain
// no services permitted
COMMIT
*filter
:INPUT DROP
:FORWARD DROP
:OUTPUT ACCEPT
:noise -
:spoof -
:spoofdrop -
:LOintra -
:LOinter -
:LOdmz -
-A INPUT -j noise
-A INPUT -j spoof
// permit all traffic initiated by transparent proxies
-A INPUT -m tproxy  -j ACCEPT
//
// permit all TCP traffic initiated by local processes, or allowed by rules
// below, we don't trust the state match for UDP traffic, they will be handled
// by individual rules below.
//
-A INPUT -p tcp -m state --state ESTABLISHED,RELATED -j ACCEPT
// permit all loopback traffic
-A INPUT -i lo -j ACCEPT
-A INPUT -i IFintra -j LOintra
-A INPUT -i IFinter -j LOinter
-A INPUT -i IFdmz   -j LOdmz
-A INPUT -j DROP
-A FORWARD -j LOG --log-prefix "FORWARD DROP: "
-A FORWARD -j DROP
// LOintra
-A LOintra -p udp --dport 53 -j ACCEPT
-A LOintra -p udp --dport 123 -j ACCEPT
-A LOintra -p tcp --syn --dport 25 -j ACCEPT
-A LOintra -j LOG --log-prefix "LOintra DROP: "
-A LOintra -j DROP
// LOinter
// permit DNS replies, bind is configured to send out DNS packets from this
// port. We could also use the state match in our INPUT chain.
-A LOinter -p udp -s DNS_SERVERS --dport 53000 -j ACCEPT
-A LOinter -p udp -s NTP_SERVERS --dport 123 -j ACCEPT
-A LOinter -p tcp --syn --dport 25 -j ACCEPT
-A LOinter -j LOG --log-prefix "LOinter DROP: "
-A LOinter -j DROP
// LOdmz
-A LOdmz -p udp --dport 53 -j ACCEPT
-A LOdmz -p udp --dport 123 -j ACCEPT
-A LOdmz -p tcp --syn --dport 25 -j ACCEPT
-A LOdmz -j LOG --log-prefix "LOdmz DROP: "
-A LOdmz -j DROP
//
// noise chain, should drop all packets which need not be logged,
// otherwise it should return to the main ruleset
//
-A noise -p udp --dport 137:139 -j DROP
-A noise -j RETURN
//
// spoof chain, should drop all packets with spoofed source address
// otherwise it should return to the main ruleset
//
-A spoof -i lo -j RETURN
-A spoof ! -i lo -s 127.0.0.0/8 -j spoofdrop
-A spoof -i IFintra ! -s NETintra -j spoofdrop
-A spoof ! -i IFintra -s NETintra -j spoofdrop
-A spoof -i IFdmz ! -s NETdmz -j spoofdrop
-A spoof ! -i IFdmz -s NETdmz -j spoofdrop
-A spoof -j RETURN
//
-A spoofdrop -j LOG --log-prefix "Spoofed packet: "
-A spoofdrop -j DROP
COMMIT
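
The logic of the spoof chain in the ruleset above can be restated in Python,
which may make it easier to reason about. This is only a model of the
rules, assuming the sample networks from section 5.1; interfaces are
referred to by zone name ('intra', 'dmz', 'inter') instead of device name,
and 'is_spoofed' is a name invented for this sketch.

```python
import ipaddress

# Networks that are known to live behind a specific interface.
ZONE_NETS = {
    "intra": ipaddress.ip_network("192.168.0.0/24"),
    "dmz": ipaddress.ip_network("10.0.0.0/24"),
}
LOOPBACK = ipaddress.ip_network("127.0.0.0/8")


def is_spoofed(iface, source):
    """Return True if a packet from 'source' must not arrive on 'iface'."""
    addr = ipaddress.ip_address(source)
    if iface == "lo":
        return False                      # -A spoof -i lo -j RETURN
    if addr in LOOPBACK:
        return True                       # loopback source on a real interface
    for zone_iface, net in ZONE_NETS.items():
        if iface == zone_iface and addr not in net:
            return True                   # foreign source on a zone interface
        if iface != zone_iface and addr in net:
            return True                   # zone source on the wrong interface
    return False


print(is_spoofed("inter", "192.168.0.5"))   # intranet address from the internet
print(is_spoofed("intra", "192.168.0.5"))   # intranet address from the intranet
```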

5.5. Configuring Zorp

This section focuses on Zorp configuration.

5.5.1. Zorp & Python

The configuration of Zorp is Python based, in fact the configuration file is
a Python module in itself. This does not mean, however, that the
administrator has to learn Python, nor does it mean that Zorp itself is
written in Python.

The use of Python is twofold: 

1) it is used as a glue to connect Zorp components together

   These parts are implemented by us and live as Python modules in the
   directory '/usr/share/zorp/pylib'.

2) it is used to describe the configuration and to customize proxy behaviour

   This part is written by the administrator, but an effort was made to make
   the configuration file look like configuration and _NOT_ a program.  A
   standard policy without tricks is easier to write than a 'netperm-table'
   (of TIS fwtk fame).
   
Though the configuration file may not seem like a Python module, it is
important to know it is parsed as one. So the following syntactical
requirements of Python apply:

* Indentation is important as it marks the beginning of a block, similar to
  what braces do in C/C++/Java. This means that the way you indent blocks
  must be consistent for that given block. For example this is correct:
  
    if self.request_url == 'http://www.balabit.hu/':
        print('debug message')
        return HTTP_REQ_ACCEPT
    return HTTP_REQ_REJECT
    
  This is not:
  
    if self.request_url == 'http://www.balabit.hu/':
          print('debug message')
        return HTTP_REQ_ACCEPT
    return HTTP_REQ_REJECT
    
  The code snippet above could be expressed in a C-like language like this:

    if (self.request_url == 'http://www.balabit.hu/')
      {
        print('debug message');
        return HTTP_REQ_ACCEPT;
      }
    return HTTP_REQ_REJECT;
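
If you want to see how Python reacts to inconsistent indentation without
starting Zorp, you can let the interpreter compile a snippet. The sketch
below wraps the two examples above in a function (a bare 'return' is only
valid inside one) and uses plain strings instead of the Zorp constants; the
names 'GOOD', 'BAD' and 'compiles' are made up for this illustration.

```python
GOOD = """
def check(url):
    if url == 'http://www.balabit.hu/':
        print('debug message')
        return 'accept'
    return 'reject'
"""

BAD = """
def check(url):
    if url == 'http://www.balabit.hu/':
          print('debug message')
        return 'accept'
    return 'reject'
"""


def compiles(source):
    """Return True if Python accepts the indentation of 'source'."""
    try:
        compile(source, "<policy>", "exec")
        return True
    except IndentationError:
        return False


print(compiles(GOOD), compiles(BAD))
```

The same kind of error is what you can expect when a policy file mixes
indentation levels within one block.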

5.5.2. Zorp components

To start configuring Zorp you will need to know the following Zorp components:

* Instance: it is possible to start several instances of Zorp just like you
  can start many instances of any program. Each zorp instance has a name and
  its own set of services to provide. Several instances can use the same
  configuration file, though each will process only the relevant parts.

* Zone: A zone encapsulates a part of the neighbouring network. Each client
  and server is a member of exactly one zone, membership based on IP
  address. Zorp uses a zone based access control which means that the
  permitted set of services available to a client/server combination is
  assigned to the zones those clients and services reside in.

* Service: a service encapsulates a proxy and associated parameters. Each
  service is identified by a unique name which is used for logging and
  access control purposes.
  
* Listener: a listener is an object which listens for connections on a given
  port and for each accepted TCP session it is capable of starting service
  instances. Listeners are the input point of Zorp, usually the packet
  filter redirects TCP sessions to one of the ports where a Zorp Listener is
  waiting.

* Router: a router in Zorp decides the destination of a given session. Each
  service has an associated Router but as it defaults to TransparentRouter
  it does not have to be explicitly given.

* Chainer: a chainer is used even less often than a Router. It is also
  associated with a service and its task is to establish the server side
  connections of proxies.
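
To make the idea of zone membership more concrete, here is a small Python
sketch that maps an IP address to one of the zones of our sample network
from section 5.1. It is not Zorp code: 'find_zone' is a hypothetical
helper, and the rule that the most specific matching network wins is an
assumption made so that the catch-all 0.0.0.0/0 zone does not shadow the
smaller ones.

```python
import ipaddress

# Zone name -> address range, taken from the sample topology.
ZONES = {
    "intra": "192.168.0.0/24",
    "dmz": "10.0.0.0/24",
    "inter": "0.0.0.0/0",
}


def find_zone(address, zones=ZONES):
    """Return the name of the most specific zone containing 'address'."""
    addr = ipaddress.ip_address(address)
    matches = []
    for name, cidr in zones.items():
        net = ipaddress.ip_network(cidr)
        if addr in net:
            matches.append((net.prefixlen, name))
    if not matches:
        return None
    # Assumption of this sketch: the longest prefix wins.
    return max(matches)[1]


for ip in ("192.168.0.10", "10.0.0.1", "11.12.13.14"):
    print(ip, find_zone(ip))
```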
  
5.5.3. The simplest Zorp configuration

Zorp uses two files to store its configuration. The file named
'instances.conf' contains the list of Zorp instances to be run. Its content
is processed by the 'zorpctl' script. The other file, usually named
'policy.py' stores the policy (aka ruleset) of one or more Zorp instances.

The following listing is a complete, working Zorp policy file with a single
instance named 'intra', and a single zone named 'inter' which encapsulates
the whole IPv4 address space. 

from Zorp.Core import *

InetZone('inter', '0.0.0.0/0')

def intra():
	pass

A few things to notice:

* this file is a Python module, hence the import statement on the first
  line: it imports all core Zorp symbols that are required even for basic
  operation.
  
* the name of our zone here matches the name we used while writing our
  packet filter ruleset, this is not a requirement, it is just good
  practice to make your firewall ruleset cleaner.

* the instance named 'intra' is represented as a Python function with no
  arguments. This is currently empty, thus the Python NOP called 'pass' as
  the function body (Python requires at least one statement in every block).
  You will see how this can be augmented with Service and Listener
  definitions so 'pass' will not be needed.

Zorp instances can be started and stopped by the 'zorpctl' script. 'zorpctl
start' starts all known instances, 'zorpctl stop' stops them. 'zorpctl'
works by parsing the '${prefix}/etc/zorp/instances.conf'. Each instance name
in the instances.conf file must have a corresponding instance definition in
the policy file to work correctly. A sample instances.conf file will be
shown in the following paragraphs.

This is simple enough, isn't it? Now let's augment it with the definitions
of our zones, and let's create one instance for each of our three zones:

from Zorp.Core import *

InetZone('intra', '192.168.0.0/24')
InetZone('dmz', '10.0.0.0/24')
InetZone('inter', '0.0.0.0/0')

def intra():
	pass

def dmz():
	pass

def inter():
	pass

You will need the following instances.conf(5) file to start your zorp
instances using zorpctl:

intra -v3 -p /etc/zorp/policy.py --autobind-ip 192.168.0.1
inter -v3 -p /etc/zorp/policy.py --autobind-ip 192.168.0.1
dmz -v3 -p /etc/zorp/policy.py --autobind-ip 192.168.0.1

The 'instances.conf' file specifies zorp startup parameters to use when the
given instance is started. Consult zorp(8) manpage or run
'/usr/lib/zorp/zorp --help' for more details.

One important point to make is the 'autobind-ip' argument in the example
above: TPROXY requires a local, non-routeable IP address to make
transparency possible. See section 5.3 and the TPROXY README file for more
details.
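
Since every instance listed in instances.conf needs a matching definition
in the policy file, a quick consistency check can save a failed startup.
The Python sketch below is not part of zorpctl; the function names are
invented, it assumes that the instance name is the first word of each line
and that blank lines and lines starting with '#' can be skipped. The policy
is only parsed, not executed, so Zorp does not need to be installed where
you run it.

```python
import ast
import shlex


def instance_names(instances_conf):
    """Return the first word of every non-empty, non-comment line."""
    names = []
    for line in instances_conf.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(shlex.split(line)[0])
    return names


def policy_functions(policy_source):
    """Return the names of the top-level functions defined in the policy."""
    tree = ast.parse(policy_source)
    return {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}


def missing_instances(instances_conf, policy_source):
    """Return instance names which have no function in the policy."""
    defined = policy_functions(policy_source)
    return [name for name in instance_names(instances_conf) if name not in defined]


conf = "intra -v3 -p /etc/zorp/policy.py\ninter -v3 -p /etc/zorp/policy.py\n"
policy = "from Zorp.Core import *\n\ndef intra():\n\tpass\n"
print(missing_instances(conf, policy))
```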

5.5.4. Adding our services

Although Zorp will start with the configuration from the previous section,
it does nothing really useful yet. To do anything useful we have to define
services and listeners.

from Zorp.Core import *
from Zorp.Http import *

InetZone('intra', '192.168.0.0/24',
	 outbound_services=['intra_HTTP'])
InetZone('dmz', '10.0.0.0/24',
	 inbound_services=['intra_HTTP'])
InetZone('inter', '0.0.0.0/0',
	 inbound_services=['intra_HTTP'])

def intra():
	Service('intra_HTTP', HttpProxy)
	Listener(SockAddrInet('192.168.0.254', 50080), 'intra_HTTP')

def dmz():
	pass

def inter():
	pass

A few things to notice:

* we have added a new import line to import all symbols the HTTP module
  provides, the most important being HttpProxy which we used in our service
  definition.
  
* we have added access control information to our zones, the 'intra_HTTP'
  service is permitted to be used outbound from the zone 'intra', and is
  permitted to target servers in the zones 'dmz' and 'inter'.
  
* we have removed the 'pass' statement from our 'intra' function and added
  two statements instead: a service definition and a listener definition

* The service definition names our new service 'intra_HTTP' which is using
  the proxy named HttpProxy, further options could be specified here as you
  will see in coming sections.

* The listener opens the port 192.168.0.254:50080, our packet filter rules
  redirect all transparent, port 80 traffic to this port

* The listener starts the service named in its second argument.

* Service names are divided into three parts separated by an underscore:
  source zone, protocol, destination zone. If the service is transparent and
  the destination is not known, the destination zone is omitted from the
  service name. This naming scheme is not required by Zorp, though the use
  of some kind of scheme makes firewall administration easier.

Here is a complete listing of the simple policy I presented in section 5.1.

from Zorp.Core import *
from Zorp.Plug import *
from Zorp.Http import *
from Zorp.Ftp import *

InetZone('intra', '192.168.0.0/24',
	 outbound_services=['intra_HTTP', 'intra_HTTPS', 'intra_FTP'])
InetZone('dmz', '10.0.0.0/24',
	 inbound_services=['intra_HTTP', 'inter_HTTP_dmz'])
InetZone('inter', '0.0.0.0/0',
	 outbound_services=['inter_HTTP_dmz'],
	 inbound_services=['intra_HTTP', 'intra_HTTPS', 'intra_FTP'])

def intra():
	Service('intra_HTTP', HttpProxy)
	Listener(SockAddrInet('192.168.0.254', 50080), 'intra_HTTP')

	Service('intra_HTTPS', PlugProxy)
	Listener(SockAddrInet('192.168.0.254', 50443), 'intra_HTTPS')

	Service('intra_FTP', FtpProxy)
	Listener(SockAddrInet('192.168.0.254', 50021), 'intra_FTP')

def dmz():
	pass

def inter():
	Service('inter_HTTP_dmz', HttpProxy,
		router=DirectedRouter(SockAddrInet('10.0.0.1', 80)))
	Listener(SockAddrInet('11.12.13.14', 50080), 'inter_HTTP_dmz')

A few things to notice:

* we have added two new import statements to import the symbols provided by
  the Plug and Ftp proxies.
* we used a PlugProxy for HTTPS purposes, we could also have used an SSL
  proxy instead
* our 'inter_HTTP_dmz' service has a fixed destination, this is accomplished by
  using an explicit router specification: we use DirectedRouter() to specify
  the destination server. All other services use the default router named
  TransparentRouter() which means they connect to the original destination
  of the client.
* the 'inter_HTTP_dmz' service has a fully qualified name, since we
  know the destination zone as - unlike other services - it has a fixed,
  predefined destination: it connects to the webserver in the DMZ.
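
The access control expressed by 'outbound_services' and 'inbound_services'
can be modelled with two set lookups. The sketch below copies the service
lists from the policy above into a plain dictionary; it is a model for
reasoning about the policy, not the code Zorp runs, and 'is_permitted' is a
name made up for this example. It assumes a session is allowed only if the
service is outbound for the client's zone and inbound for the server's
zone.

```python
# Service lists copied from the InetZone definitions of the sample policy.
ZONE_POLICY = {
    "intra": {
        "outbound": {"intra_HTTP", "intra_HTTPS", "intra_FTP"},
        "inbound": set(),
    },
    "dmz": {
        "outbound": set(),
        "inbound": {"intra_HTTP", "inter_HTTP_dmz"},
    },
    "inter": {
        "outbound": {"inter_HTTP_dmz"},
        "inbound": {"intra_HTTP", "intra_HTTPS", "intra_FTP"},
    },
}


def is_permitted(service, client_zone, server_zone, policy=ZONE_POLICY):
    """Return True if 'service' may connect client_zone to server_zone."""
    return (service in policy[client_zone]["outbound"]
            and service in policy[server_zone]["inbound"])


print(is_permitted("intra_HTTP", "intra", "inter"))
print(is_permitted("intra_FTP", "intra", "dmz"))
```

Walking through a few combinations like this is a cheap way to confirm that
the zone definitions say what you meant before loading the policy.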

5.5.5. Customizing proxies

In the previous section we implemented a firewall policy in about 30 lines.
Although our example was quite simple, there are real world firewalls with
policies no more complicated than our sample.

Until now we have not really used the fact that we have a programming
language in our hands. The configuration above is simple, but it doesn't
show the potential Zorp provides.

The second argument of a Service statement is a proxy class, the fact it is
a class makes easy customization possible. As customization requires a bit
more knowledge about Python, we provided a good number of predefined proxy
classes. As an example Ftp has a couple of predefined variations: 

* FtpProxyRO, 
  an FTP proxy which permits downloading only
  
* FtpProxyAnonRO, 
  an FTP proxy which permits downloading only, and only the anonymous user
  is permitted

If you cannot find the necessary customization, then - and only then - do
you need to derive your own class. The next listing shows how.

class HttpProxyAnonimize(HttpProxy):
	def config(self):
		HttpProxy.config(self)
		# customization statements

The listing above shows a class definition in Python. Our new class is
named 'HttpProxyAnonimize', it is derived from HttpProxy and it defines a
method named 'config'. The 'config' method first calls the 'config' method
of the superclass, so that the default configuration settings are inherited
as well. You can take the above code snippet as a skeleton for your future
customizations. Changing the parent class to 'FtpProxy' and making our
'config' method call the 'config' method of 'FtpProxy' would create a
customized Ftp proxy class.
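
Why the parent's 'config' should be called first can be illustrated without
Zorp at all. The following is a minimal, self-contained sketch: 'BaseProxy',
'timeout' and 'max_line_length' are made-up stand-ins, not Zorp names, and
the way the constructor triggers 'config' is an assumption of the sketch.

```python
# Stand-in classes for illustration only; none of these names come from Zorp.

class BaseProxy:
    """Hypothetical parent class playing the role of HttpProxy or FtpProxy."""

    def __init__(self):
        # In this sketch the constructor triggers configuration; how and
        # when Zorp itself calls config() is not modelled here.
        self.config()

    def config(self):
        # default settings provided by the parent class
        self.timeout = 300
        self.max_line_length = 4096


class CustomProxy(BaseProxy):
    def config(self):
        # first pick up the parent's defaults ...
        BaseProxy.config(self)
        # ... then override only what we care about
        self.timeout = 60


class ForgetfulProxy(BaseProxy):
    def config(self):
        # the parent's config() is NOT called: its defaults are never set
        self.timeout = 60


custom = CustomProxy()
forgetful = ForgetfulProxy()

print(custom.timeout, custom.max_line_length)
print(hasattr(forgetful, "max_line_length"))
```

CustomProxy ends up with its own timeout and the inherited line length
limit, while ForgetfulProxy lacks the inherited setting altogether.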

What can you put in your 'config()' method? Anything that the proxy
provides. Our HTTP proxy has over 30 settings and there are complex
filtering rules that you can set. Documentation on each attribute a given
proxy provides can be found in the Python module for that proxy. This means
that the documentation for our Http proxy can be found in
/usr/share/zorp/pylib/Zorp/Http.py. The documentation usually also contains
examples.

Now, let us create an Http proxy that hides the browser type:

class HttpProxyAnonimize(HttpProxy):
	def config(self):
		HttpProxy.config(self)
		self.request_headers["User-Agent"] = (HTTP_HDR_CHANGE_VALUE, "Mozilla 5.0")

That's it, now you can refer to HttpProxyAnonimize from your service
definitions like this:

def intra():
	Service('intra_HTTP', HttpProxyAnonimize)
	Listener(SockAddrInet('192.168.0.254', 50080), 'intra_HTTP')
	
	...

A slightly more complex example shows how to remove Referer information.
This is a bit more difficult, as a lot of sites rely on Referer being
correct; some of them simply stop working if the referer does not point to
them. Thus simply changing the referer value to something fixed will not work.

We work around this by setting the referer field to the currently requested
URL. Let us extend our previous HttpProxyAnonimize with this feature:

class HttpProxyAnonimize(HttpProxy):
	def config(self):
		HttpProxy.config(self)
		self.request_headers["User-Agent"] = (HTTP_HDR_CHANGE_VALUE, "Mozilla 5.0")
		self.request_headers["Referer"] = (HTTP_HDR_POLICY, self.rewriteReferer)

	def rewriteReferer(self, name, value):
		self.current_header_value = self.request_url
		return HTTP_HDR_ACCEPT

As you can see, we have defined a Python method to perform the Referer
change. Of course we can do even more complex things by extending the proxy
functionality here and there, but that is beyond the scope of this
document.
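
To make the flow of the Referer rewrite easier to follow, the sketch below
models it in plain Python. It is only a model under stated assumptions: the
numeric constant values, the 'FakeHttpProxy' class and its
'filter_request_headers' method are invented for this illustration and say
nothing about how Zorp processes its policy hashes internally. Only the
names HTTP_HDR_CHANGE_VALUE, HTTP_HDR_POLICY, HTTP_HDR_ACCEPT,
request_headers, current_header_value and request_url are taken from the
listing above.

```python
# Model of the header policy from the listing above. The numeric values
# and the dispatch logic are assumptions made for this sketch.
HTTP_HDR_ACCEPT = 1
HTTP_HDR_CHANGE_VALUE = 2
HTTP_HDR_POLICY = 3


class FakeHttpProxy:
    """Hypothetical stand-in for HttpProxy, just enough to run the example."""

    def __init__(self, request_url):
        self.request_url = request_url
        self.request_headers = {}
        self.current_header_value = None
        self.config()

    def config(self):
        pass

    def filter_request_headers(self, headers):
        """Apply self.request_headers to a dict of incoming headers."""
        result = {}
        for name, value in headers.items():
            rule = self.request_headers.get(name)
            if rule is None:
                result[name] = value          # no rule: pass unchanged
            elif rule[0] == HTTP_HDR_CHANGE_VALUE:
                result[name] = rule[1]        # fixed replacement value
            elif rule[0] == HTTP_HDR_POLICY:
                # let the callback inspect and possibly rewrite the value
                self.current_header_value = value
                if rule[1](name, value) == HTTP_HDR_ACCEPT:
                    result[name] = self.current_header_value
        return result


class HttpProxyAnonimize(FakeHttpProxy):
    def config(self):
        FakeHttpProxy.config(self)
        self.request_headers["User-Agent"] = (HTTP_HDR_CHANGE_VALUE, "Mozilla 5.0")
        self.request_headers["Referer"] = (HTTP_HDR_POLICY, self.rewriteReferer)

    def rewriteReferer(self, name, value):
        self.current_header_value = self.request_url
        return HTTP_HDR_ACCEPT


proxy = HttpProxyAnonimize("http://www.example.com/index.html")
filtered = proxy.filter_request_headers({
    "Host": "www.example.com",
    "User-Agent": "SomeBrowser/1.0",
    "Referer": "http://other.example.org/secret-page",
})
print(filtered)
```

In this model the Host header passes untouched, User-Agent is replaced by
the fixed string, and Referer is replaced by the URL being requested.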

5.6. What is modularity?

As I have already written in the introduction, Zorp has a modular
architecture, proxies can extend each other. But what does this mean
exactly?

Each proxy in Zorp is generalized in a way that it is independent of the
communication mechanism used towards its client or server peers. This means
that proxies don't really care whether they communicate using a real TCP
connection or a UNIX domain socket. This comes in handy when we want to
analyze something that has multiple protocol levels:

For example simple HTTP uses TCP as its transport protocol but does not have
authentication or integrity protection on its own. If you add SSL to the
picture you get HTTPS: HTTP running on SSL, which in turn runs on TCP. As
Zorp has an SSL capable proxy (implementing an MITM in fact), we can
construct an HTTPS proxy out of our ordinary HTTP and SSL proxies. Here is
an example:

class HttpsProxy(PsslProxy):

	class EmbeddedHttpProxy(HttpProxy):
		def config(self):
			HttpProxy.config(self)
			self.request_headers["User-Agent"] = (HTTP_HDR_DROP)

	def config(self):
		PsslProxy.config(self)
		self.client_need_ssl = TRUE
		self.client_key_file = '/etc/zorp/https.key'
		self.client_cert_file = '/etc/zorp/https.crt'
		self.client_verify_type = PSSL_VERIFY_NONE
		self.server_need_ssl = TRUE
		self.server_ca_directory = '/etc/zorp/https_trusted_ca.crt'
		self.server_verify_type = PSSL_VERIFY_REQUIRED_TRUSTED
		
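		# here we specify that decrypted protocol stream is
		# to be passed to an instance of EmbeddedHttpProxy above;
		# a nested class is not visible as a bare name inside a
		# method, so it is referenced through its outer class
		self.stack_proxy = HttpsProxy.EmbeddedHttpProxy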


A few things to notice:
* Python allows class definitions to be nested, which comes in handy when
  defining "EmbeddedHttpProxy": this way we can emphasize that it is embedded
  into "HttpsProxy". However, this is syntactic sugar only; nothing requires
  you to define embedded proxy classes within other classes, and you can use
  your proven Http filtering class inside SSL.
* The most important part is the line with the comment: we specify that
  as soon as the SSL handshakes are completed, an EmbeddedHttpProxy should
  be started with the decrypted protocol streams. The stacked proxy can do
  anything to modify protocol contents.
* The PSSL proxy allows encryption to be enabled or disabled independently
  on its client and server sides, thus it can be used to wrap protocol
  streams into SSL or to unwrap them out of it.
* Of course the proxy above can be fully transparent.
* Not all proxies provide embedded protocol streams that you can attach
  other proxies to (PsslProxy and PlugProxy do, FingerProxy does not).
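
The idea of stacking can also be modelled in a few lines of plain Python. In
the sketch below the outer class removes a wrapping layer and hands the
inner data to whatever class is assigned to 'stack_proxy'. Everything except
the attribute name 'stack_proxy' is hypothetical: base64 merely stands in
for SSL, and 'WrapperProxy', 'UpperCaseProxy', 'Embedded' and 'process' are
names invented for this illustration.

```python
import base64


class UpperCaseProxy:
    """Hypothetical inner proxy: it sees only the unwrapped data."""

    def process(self, data):
        return data.upper()


class WrapperProxy:
    """Hypothetical outer proxy; base64 plays the role of the SSL layer."""

    class Embedded(UpperCaseProxy):
        pass

    def __init__(self):
        self.stack_proxy = None
        self.config()

    def config(self):
        # A class nested in another class is not visible as a bare name
        # inside the methods, so it is referenced through the outer class.
        self.stack_proxy = WrapperProxy.Embedded

    def process(self, wrapped):
        plain = base64.b64decode(wrapped)               # "decrypt"
        if self.stack_proxy is not None:
            plain = self.stack_proxy().process(plain)   # stacked proxy
        return base64.b64encode(plain)                  # "re-encrypt"


outer = WrapperProxy()
reply = outer.process(base64.b64encode(b"get /index.html"))
print(base64.b64decode(reply))
```

The inner class never deals with the wrapping itself, which is the point of
stacking: the same inner class could be used with or without the outer one.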


6. Where to look for further information

Zorp Professional Getting Started - 
  available on our website in PostScript form, this book is shipped with
  every Professional Zorp copy

Proxy specific documentation -
  it is available in inline Python docstrings of each proxy module

Python layer -
  a couple of Zorp objects are implemented in pure Python, each of these
  classes is documented in the appropriate Python module.

mailing list -
  last but not least, the mailing list and its archive are a useful
  resource. We usually respond to questions quite fast.
  The mailing list is at zorp@lists.balabit.hu; you can subscribe at
  http://lists.balabit.hu/mailman/listinfo/zorp, or through its email
  interface.
  To obtain instructions on valid Mailman email commands, send email to
  <zorp-request@lists.balabit.hu> with the word "help" in the subject
  line or in the body of the message.

