Apache2 Authentication against Active Directory
Apache 2 secure reverse proxy running on Debian Linux and authenticating against Windows 2003 Server Active Directory using secure LDAP via mod_auth_pam and pam_ldap.
Debian 3.1 (Sarge)
Apache2 Installation
In this implementation there is a single unsecured home page in the main server, and then an IP based secure, authenticated virtual server proxying each origin server. The configuration specific to each proxying server is held in its own file within /etc/apache2/sites-enabled, and it is in these files that we configure authentication.
Providing mod_auth_pam is loaded (look in /etc/apache2/mods-enabled) and has not been disabled elsewhere in the configuration, and that there is no competing authentication module loaded, all that is required is something of the form:
AuthName "Secure reverse proxy"
Allow from all
This type of authentication didn't work reliably under apache2-mpm-worker but did under apache2-mpm-prefork. Under the worker MPM, Internet Explorer occasionally failed to load page requisites (e.g. images and style sheets), and segmentation faults were reported in Apache's error log. With other browsers the errors were still logged, but no items went missing.
Apache2's PAM configuration will be held in /etc/pam.d/apache2. This contains:
auth required pam_ldap.so
auth required pam_caseless_listfile.so onerr=fail item=user sense=deny file=/path
account required pam_permit.so
The first line indicates that authentication against pam_ldap is required. The second line uses a (slightly) custom module to block certain accounts. The third line means users do not need accounts on the system running Apache.
The (slightly) custom module is worthy of further discussion. We use a number of accounts for system registration purposes, and their usernames and passwords are widely known. We therefore need to prevent them from being used for authentication here. There is a standard module, pam_listfile, that allows us to do this, but it uses case-sensitive username matching. We wanted case-insensitive matching.
A quick and dirty fix requires a one line change and a recompile. The diff for pam_listfile.c is:
+ /* adapted for caseless comparison of the content of the list file.
+  * This involves changing strcmp(aline,citemp) to strcasecmp(aline,citemp)
+  */
- retval = strcmp(aline,citemp);
+ retval = strcasecmp(aline,citemp);
We made the change slightly less dirty by creating a new module based on the original.
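To make the effect of the change concrete, here is a minimal Python sketch of the matching rule, not the module's actual C code. The function name is_listed and the sample account names are invented for illustration, and Python's lower() is only an approximation of strcasecmp for plain ASCII usernames.

```python
def is_listed(username, list_lines, caseless=True):
    """Return True if username matches any entry in the list file lines.

    caseless=False mimics the strcmp behaviour of pam_listfile,
    caseless=True mimics the strcasecmp variant (ASCII names assumed).
    """
    wanted = username.lower() if caseless else username
    for line in list_lines:
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        if caseless:
            entry = entry.lower()
        if entry == wanted:
            return True
    return False


# Hypothetical list file contents, one account name per line
deny_list = ["Register\n", "kiosk\n"]
```

With sense=deny, a listed user is refused, so is_listed("REGISTER", deny_list) being true corresponds to that login being blocked, whereas the case-sensitive comparison would have let it through.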
The configuration for pam_ldap is held in /etc/pam_ldap.conf. It's useful while testing to set up debugging (but don't forget to turn it off when the system goes into service). Therefore add lines something like:
debug 1
logdir /tmp
Leaving referrals turned on significantly slowed down authentication, and since they are not needed in our application, add the line:
Tell pam_ldap which domain(s) or host(s) to send requests to with the host directive. There are multiple Domain Controllers in our Windows domain and each has an LDAP server. We wanted to be able to select an arbitrary one by specifying the domain name rather than giving one or more host names. However, using secure LDAP we found that the certificate presented by each server contained that server's host name, and the mismatch between the domain name and the host name upset OpenLDAP. We tried setting tls_checkpeer no to turn off certificate checking, but this didn't seem to make any difference. In the end, we listed the host names of the Domain Controllers (space separated). The result should be that requests are directed to the first one in the list unless or until there is a timeout, when the next one will be tried. The resulting host directive looks something like:
host dc1.our.domain.name.tld dc2.our.domain.name.tld dc3.our.domain.name.tld
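The fallback behaviour described above can be modelled in a few lines of Python. This is only a sketch of the idea (try each host in order, move on when one fails or times out), not pam_ldap's implementation. The function first_responding and its connect argument are hypothetical names introduced for illustration.

```python
def first_responding(hosts, connect):
    """Try each host in order; return (host, connection) for the first that works.

    `connect` is a caller-supplied function (hypothetical here) that returns
    a connection object, or raises OSError on failure or timeout.
    """
    last_error = None
    for host in hosts:
        try:
            return host, connect(host)
        except OSError as exc:
            last_error = exc  # remember why this host failed, then try the next
    raise OSError("no host responded: %s" % last_error)
```

One consequence of this ordering is that the first Domain Controller in the list takes all the traffic while it is up, so the order of the host names matters.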
The search base is set with the base directive. We have user records potentially under any organisational unit, so the search has to start at the top of our tree.
Anonymous queries are not normally accepted by AD, so we need to set a distinguished name and password to bind as. We couldn't make this work with the normal way of giving a distinguished name, so we used Microsoft's User Principal Name (UPN) format:
binddn [email protected]
The best port to use appears to be the secure port of the Global Catalog:
Since user records could be anywhere, the search scope needs to be subtree:
scope sub
The object class we're interested in is User, so set a filter:
The best login attribute to use in AD is sAMAccountName:
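As an illustration of how a filter and a login attribute work together, the sketch below builds the kind of LDAP search filter that would select one user record. The combined form (&(filter)(attribute=username)) is an assumption about how the two settings are joined, and the helper names are invented for this example. The escaping follows the rules for LDAP filter values, so that a username containing * or parentheses cannot alter the search.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP search filter value."""
    out = []
    for ch in value:
        if ch in '\\*()\x00':
            out.append('\\%02x' % ord(ch))  # e.g. '*' becomes '\2a'
        else:
            out.append(ch)
    return ''.join(out)


def build_user_filter(username, base_filter='objectClass=User',
                      login_attribute='sAMAccountName'):
    """Combine the object class filter and login attribute (assumed form)."""
    return '(&(%s)(%s=%s))' % (base_filter, login_attribute,
                               escape_filter_value(username))
```

For example, build_user_filter('jsmith') gives (&(objectClass=User)(sAMAccountName=jsmith)), which can be pasted into an LDAP browser to check that exactly one record comes back.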
We're not providing for password updating, but for potential future use, set the pam_password
Turn on SSL:
Regardless of whether we set tls_checkpeer or not, we couldn't get this working without having access to the certificate of the certificate authority that issued the server's certificate.
Getting the right certificate in the right format can be a lengthy process. Any Windows machine attached to the domain can get this certificate. From the Control Panel select Internet Options, then Content, Certificates and click on the Trusted Root Certification Authorities tab. In that list there should be one or more certificates issued to and issued by the name of the Windows domain or the organisation.
Select the one with the latest expiry date and click Export..., then Next >. Select Base-64 encoded X.509 (.CER), then click Next >. Browse to temporarily save it somewhere sensible, then make the obvious clicks to finish.
This file needs to be transferred to the system running Apache 2 and placed (say) under /usr/share/ca-certificates.
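Since exporting in the wrong format is an easy mistake to make, a quick check of the transferred file can save some head scratching. The sketch below only tests that the data has the Base-64 (PEM) envelope and decodes cleanly. It does not validate the certificate itself, and the function name is made up for this example.

```python
import ssl


def looks_like_pem_certificate(data):
    """Return True if data (bytes) is a Base-64 encoded (PEM) certificate.

    Only the BEGIN/END CERTIFICATE envelope and the Base-64 body are
    checked; the X.509 content is not parsed.
    """
    try:
        text = data.decode("ascii").strip()
        ssl.PEM_cert_to_DER_cert(text)
    except (UnicodeDecodeError, ValueError):
        return False
    return True
```

A file exported as DER encoded binary instead of Base-64 will fail this check, because it is not ASCII text with the expected header and footer lines.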
Tell pam_ldap to look at it using something of the form:
Finally, disable SASL security so we can work with AD:
We use a mod_python filter to perform crude but effective regular expression modification of URLs in HTML and CSS files. The result is that files are rewritten as they come through the reverse proxy.
More complex processing could be used at the cost of performance. The performance hit could probably be offset by caching the transformed files.
The filter looks like this:
from mod_python import apache
import re

replacements = (
    # (compiled regular expression, replacement string) pairs go here
)

def outputfilter(filter):
    # AddOutputFilter didn't seem to work for proxied requests,
    # so use SetOutputFilter and have all types come through here
    if filter.req.content_type != 'text/html' and filter.req.content_type != 'text/css':
        filter.pass_on()  # not HTML or CSS, so pass the data through untouched
        return
    if not hasattr(filter.req, 'temp_doc'):  # the start
        filter.req.temp_doc = []  # create new attribute to hold document
        # If content-length ended up wrong, Gecko browsers truncated data, so
        # remove the header altogether
        if "Content-Length" in filter.req.headers_out:
            del filter.req.headers_out["Content-Length"]
    temp_doc = filter.req.temp_doc
    s = filter.read()
    while s:  # could get '' at any point, but only get None at end
        temp_doc.append(s)
        s = filter.read()
    if s is None:  # the end
        temp_doc = ''.join(temp_doc)
        for (regex, new) in replacements:
            temp_doc = regex.sub(new, temp_doc)
        #filter.req.set_content_length(len(temp_doc)) # this didn't seem to work
        filter.write(temp_doc)
        filter.close()
Developing this threw up a few hitches. It is important to understand how filters work: they can be called any number of times during the processing of a single request, and are fed an arbitrary chunk of data each time. The readline method does not seem to read whole lines reliably, so line-at-a-time processing was not as easy as it should have been. In the end, we set a temp_doc attribute on the request, used that to buffer the entire file, ran the regular expressions over it, and then wrote it out. Rewriting usually changes the length of the data, so that the Content-Length header is no longer correct. This can result in the browser stopping reading before the end of the data is reached. Setting the header to the new length didn't seem to work, so we resorted to removing the header altogether.
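The reason for buffering the whole document can be shown without Apache at all. In the sketch below, a URL happens to be split across two chunks. Rewriting each chunk as it arrives misses it, while joining the chunks first catches it. The pattern and the replacement address (proxy.name.tld:8443) are invented for illustration and are not our real rules.

```python
import re

# Hypothetical rewrite rule: point absolute links at the proxy instead
replacements = (
    (re.compile(r'http://www\.name\.tld/'), 'https://proxy.name.tld:8443/'),
)


def rewrite(text):
    """Apply every (regex, replacement) pair to the text."""
    for regex, new in replacements:
        text = regex.sub(new, text)
    return text


def rewrite_per_chunk(chunks):
    """Naive approach: rewrite each chunk as it arrives."""
    return ''.join(rewrite(c) for c in chunks)


def rewrite_buffered(chunks):
    """Approach described above: collect every chunk, then rewrite once."""
    return rewrite(''.join(chunks))


# A URL that happens to be split across two chunks
chunks = ['<a href="http://www.na', 'me.tld/page.html">link</a>']
```

Here rewrite_per_chunk(chunks) returns the document unchanged, whereas rewrite_buffered(chunks) rewrites the link. The cost is that the whole file is held in memory before anything is sent on.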
The module (file) containing the filter can be located anywhere convenient, provided the containing directory is on the Python path. This can be done using the PythonPath directive, best placed in /etc/apache2/conf.d/python.conf:
Finally, configure the filter in each reverse proxying virtual server with (assuming the module is called mangleurls.py) something like:
PythonOutputFilter mangleurls MANGLEURLS
Virtual Servers Configuration
As was said earlier, there is a virtual server for each proxied server. The configuration specific to each virtual server is contained in a file under /etc/apache2/sites-enabled. The virtual servers use different ports on the same host name for ease of adding new ones. Care needs to be taken that no firewalls along the route block any of the ports used. Alternatively, a new IP and name could be used for each proxied site. Example virtual proxy server configuration looks something like:
# SSL Engine Switch:
# Enable/Disable SSL for this virtual host.
# for reverse proxy Off is correct
AuthName "Authentication Domain Goes Here"
Allow from all
ProxyPass / http://www.name.tld/
ProxyPassReverse / http://www.name.tld/
PythonOutputFilter mangleurls MANGLEURLS
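Because each virtual server listens on its own port, it can be worth checking from a client network that every port is actually reachable through any firewalls. A minimal sketch of such a check follows. The function name is made up, and the host and port values are whatever applies to your own setup.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

This only shows that something accepts connections on the port. It says nothing about whether SSL or authentication is configured correctly behind it.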
We wished to update the list of blocked accounts automatically from a group within AD. We did this with a Python script using python-ldap, run as a cron job. This turned out to be fairly easy to do:
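import sys
import ldap

banned_list = []
conn = ldap.initialize("ldaps://domain.name.tld:3269")
# AD does not normally accept anonymous queries; both arguments are placeholders
conn.simple_bind_s('UPN of bind account', 'password of bind account')
results = conn.search_s('DN of blocked user group', ldap.SCOPE_BASE, 'objectClass=*', ['member'])
if len(results) != 1:  # there can be only one!
    sys.exit("found %d results from LDAP search, expected 1" % len(results))
rec = results[0][1]  # each result is a (dn, attributes) pair
for dn in rec['member']:  # we have a list of group member DNs, but we need their sAMAccountNames
    r = conn.search_s(dn, ldap.SCOPE_BASE, 'objectClass=*', ['sAMAccountName'])
    if len(r) == 1:
        try:
            banned_list.append(r[0][1]['sAMAccountName'][0])
        except (KeyError, IndexError):
            sys.exit("Unable to get sAMAccountName")
    else:
        sys.exit("found %d results from LDAP search, expected 1" % len(r))
for u in banned_list:
    print(u)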
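The script above simply prints the names. If it were to write the list file itself, one thing to watch is that PAM could read the file while it is half written. The sketch below writes to a temporary file in the same directory and then renames it into place. The function name and the idea of writing the file directly from the script are assumptions, not part of our original setup.

```python
import os
import tempfile


def write_list_file(path, usernames):
    """Write one username per line to path, replacing any old file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".banned-")
    try:
        with os.fdopen(fd, "w") as tmp:
            for name in usernames:
                tmp.write(name + "\n")
        os.chmod(tmp_path, 0o644)   # mkstemp creates the file as 0600
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)     # do not leave a stray temporary file
        raise
```

If any LDAP lookup fails, the script exits before this function is reached, so the previous list stays in place instead of being replaced by a partial one.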