Re-building the Kerberos database on OS X 10.6 Server.

May 12th, 2010 by Oliver Helm

I had a problem with a testing server earlier today where the Kerberos database had become corrupt. For all users on my Open Directory Master the Kerberos passwords were flagged as incorrect, and changing them from Workgroup Manager had no effect. Changing them from the command line was not an option, as this relies on knowing the user’s original password, which was corrupt.

The Kerberos service can be restarted by grepping the output of the ‘ps aux’ command for the Kerberos process (usually named ‘krb5kdc’) and then issuing a kill command against its PID. The service will then restart automatically.

ps aux | grep krb5kdc
kill kerberos_PID
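
The two commands above can be combined into a small helper. This is only a sketch: the function name pids_for is made up for illustration, and it assumes the usual ‘ps aux’ layout, where the PID is the second column and the command starts in the eleventh.

```shell
# pids_for NAME: read `ps aux` output on stdin and print the PID of every
# process whose command contains NAME. (Hypothetical helper, not an Apple tool.)
pids_for() {
    # NR > 1 skips the header line; $2 is the PID, $11 is the start of the command.
    awk -v name="$1" 'NR > 1 && index($11, name) { print $2 }'
}

# Example usage (run as root); the service should then restart on its own:
#   ps aux | pids_for krb5kdc | xargs kill
```

Matching only the command column keeps the helper from picking up its own pipeline, which a plain grep over the whole line can do.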

This improved matters slightly, as it allowed newly created users to use Kerberos with their correct password.

Re-building the kerberos database is done with the following command:

slapconfig -kerberize -f diradmin

This needs to be run as root, either directly or via sudo. The -f flag forces the current setup to be overwritten.
I would recommend taking a full backup of your users and groups, as well as an archive of your Open Directory server from Server Admin. Stopping any services that rely on Kerberos would also be a good idea.
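
As a rough sketch, the backup and the forced rebuild can be chained so that the rebuild only happens if the backup succeeded. The function names run and rekerberize and the archive path are hypothetical. The sketch also assumes that ‘slapconfig -backupdb’ (mentioned in the comments below) takes the archive path as its argument. By default it only prints the commands; set DRY_RUN=0 to actually execute them as root.

```shell
# run CMD...: print the command; only execute it when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# rekerberize ADMIN ARCHIVE: back up Open Directory first, then force the
# Kerberos rebuild. The second command is skipped if the first one fails.
rekerberize() {
    run slapconfig -backupdb "$2" &&
    run slapconfig -kerberize -f "$1"
}

# Example (prints the two commands without running them):
#   rekerberize diradmin /Volumes/Backup/od_archive
```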

Re-building the kerberos database from scratch.

If neither of the above options worked, it is possible to rebuild your Kerberos database from scratch, nuking your old database. This would also be necessary if you are changing the Kerberos domain; however, don’t forget that if doing this you would also have to change the search path in all your LDAP and Password Server databases.

To completely rebuild kerberos.

1) Stop the OD Service.
2) Log into a shell as root and run the following command:

sso_util remove -k -a diradmin -p your_diradmin_password -r your_kerberos_realm

3) Remove the following files and directories from your system:

/var/db/krb5kdc/
/Library/Preferences/edu.mit.Kerberos
/etc/krb5.keytab

4) Run the following set of commands as root:

dscl 127.0.0.1
cd /LDAPv3/127.0.0.1/Config/
auth diradmin (and enter your diradmin password)
delete KerberosKDC
delete KerberosClient
quit

5) Find any Kerberos (krb5kdc) and kadmind processes and kill them (as shown above).
6) Re-build your kerberos database:

slapconfig -kerberize diradmin
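
Step 4 can probably also be done without the interactive dscl shell. The sketch below assumes that dscl’s -u and -p options and its -delete command work against the /LDAPv3/127.0.0.1 node as described in the dscl man page. It is untested on a 10.6 server, so it only prints the commands for review. The function name config_delete_cmds is made up for illustration.

```shell
# config_delete_cmds ADMIN: print non-interactive dscl commands that would
# remove the two Kerberos config records from step 4.
# -u names the directory admin; -p makes dscl prompt for the password.
config_delete_cmds() {
    for record in KerberosKDC KerberosClient; do
        echo "dscl -u $1 -p /LDAPv3/127.0.0.1 -delete /Config/$record"
    done
}

# Example: config_delete_cmds diradmin
```

Printing rather than executing keeps the destructive part as a deliberate manual step.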

Testing Kerberos:

You can check if the users’ passwords are now being accepted using the ‘kpasswd’ command. The ‘kinit’ command can also be used to test creating a kerberos ticket.
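
A minimal sketch of the ‘kinit’ test, assuming the standard kinit, klist and kdestroy tools are on the path. The function name check_ticket and the example principal are hypothetical. Note that kdestroy clears the default credential cache, so run this from an account whose existing tickets you do not mind losing.

```shell
# check_ticket PRINCIPAL: request a ticket (kinit prompts for the password),
# list it, then throw it away again. Returns non-zero if kinit fails.
check_ticket() {
    if kinit "$1"; then
        klist
        kdestroy
        echo "OK: $1 obtained a ticket"
    else
        echo "FAILED: $1 could not obtain a ticket" >&2
        return 1
    fi
}

# Example: check_ticket testuser@EXAMPLE.COM
```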

14 comments

  1. Clint McIntosh says:

    How do you stop the “OD Service”? There is no Stop button in Server Admin.

  2. Oliver Helm says:

    Hi Clint McIntosh,

    The best way to stop the OD Service is to issue the following command as root (from shell):

    sudo launchctl unload /System/Library/LaunchDaemons/org.openldap.slapd.plist

    It can then be restarted in a similar manner with:

    sudo launchctl load /System/Library/LaunchDaemons/org.openldap.slapd.plist

    If all else fails search for the process ID using “ps aux” and kill it.

    Hope that helps. – Thanks for your comment.

    - Oliver

  3. Jon says:

    Hi,

    thanks for this nice manual at first.

    I guess I’ve got a very tricky configuration.
    After upgrading from 10.5 I noticed that Kerberos was not running, so I tried your steps. Apart from stopping the OD Service :( I followed them step by step.
    But after entering slapconfig -kerberize diradmin
    I get this error: kdcsetup command failed with exit code 255
    and have no clue what to do.
    After this the “Kerberize” Button appeared in Server-Admin. But it looks like it does not accept any of my authentication. Neither sysadmin nor diradmin.

  4. Oliver Helm says:

    Hi Jon,

    Glad you found the post useful.

    This sounds like a strange one though! First off, when you started the stages above, were there definitely no Kerberos services running? You can check this easily using the ‘ps aux’ command. Everything needs to be killed before running through the steps in the manual above.

    Assuming there was nothing running I would try the following:

    1) Do you see the diradmin and sysadmin users in workgroup manager? If so try just changing the password for these users from there, as it may just sort out any small corruption in the database.

    2) The next thing I would check would be the DNS and reverse DNS records. It might be an idea to enter the associated hosts into the server’s local hosts file (/etc/hosts). All of the directory services rely heavily on DNS records being correct.

    3) Lastly, you can add a ‘-f’ flag to the slapconfig command if you wish to force a server to be kerberized, but I’m not sure that would help much in this situation.

    Failing all of the above, could you post any relevant-looking bits from the slapconfig.log and I’ll have another go!

    Thanks for your comment.

    - Oliver

  5. Jon says:

    Hi Oliver,

    thank you very much for your quick and extensive answer.

    I checked and went through the steps of your manual again from the beginning.
    I found that there are some more issues on my system which I didn’t really pay attention to on my first run.

    1. The command ‘dscl 127.0.0.1’ does not work. I have to use ‘dscl localhost’ instead.

    2. Same issue with ‘cd /LDAPv3/127.0.0.1/Config/’. Here I had to replace 127.0.0.1 with OS-X-Server, which is my server’s name.

    I guess this must be one reason why ‘slapconfig -kerberize diradmin’ does not work, because I see in the logs that the commands it runs use the same values as yours.
    But I don’t have any idea how and where to change this.

    Maybe it would be better if I do a clean install and import the users. But then I don’t know how to import them with passwords, because they are not included in the Workgroup Manager export. The import/export of archives feature in Server Admin does not work with my database. Maybe because of the same failures.

    I thank you very much for your response! Hope you have an idea.

  6. Oliver Helm says:

    Hi Jon,

    Nice, haven’t seen that one before! There’s clearly something going wrong with the mapping of the localhost address there, and it would definitely explain why the slapconfig command is failing. Can you stick a copy of your /etc/hosts file up?

    If you still have an active copy of the database somewhere, other than on this problematic server, you can take a full backup of the LDAP server, including the password database, using the ‘slapconfig -backupdb’ command.

    Keep your findings coming!

    - Oliver

  7. Oliver Helm says:

    Jon,

    Also, can you run ‘ifconfig’ from shell and put the results up? I wonder if your Loopback interface has not come up…

    - Oliver

  8. Jon says:

    Hi Oliver,

    jep this really looks pretty weird.
    When I looked into /etc/hosts I found two very strange entries at the end, which I would normally delete…

    Here comes the information you wanted:

    /etc/hosts
    ##
    # Host Database
    #
    # localhost is used to configure the loopback interface
    # when the system is booting. Do not change this entry.
    ##
    127.0.0.1 localhost
    255.255.255.255 broadcasthost
    ::1 localhost
    fe80::1%lo0 localhost

    lo0: flags=8049 mtu 16384
    inet6 ::1 prefixlen 128
    inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1
    inet 127.0.0.1 netmask 0xff000000
    gif0: flags=8010 mtu 1280
    stf0: flags=0 mtu 1280
    en0: flags=8863 mtu 1500
    ether 00:1f:5b:31:90:9c
    media: autoselect
    status: inactive
    en1: flags=8863 mtu 1500
    ether 00:1f:5b:31:90:9d
    inet6 fe80::21f:5bff:fe31:909d%en1 prefixlen 64 scopeid 0x5
    inet 192.168.0.1 netmask 0xffff0000 broadcast 192.168.255.255
    media: autoselect (100baseTX )
    status: active
    fw0: flags=8822 mtu 4078
    lladdr 00:1f:f3:ff:fe:0c:53:de
    media: autoselect
    status: inactive
    ppp0: flags=8051 mtu 1444
    inet 192.168.0.1 --> 192.168.3.102 netmask 0xffffff00

    Jon

  9. Jon says:

    Oh sorry, I forgot to mark the beginning of ifconfig’s output.
    It starts at ‘lo0: flags=8049 mtu 16384’

    Jon

  10. Oliver Helm says:

    Hi Jon,

    The odd-looking lines at the end of the hosts file are there for IPv6 support, so they definitely need to stay in! The loopback interface is clearly coming up OK, so I guess that might narrow it down to the routing table. Could you run “netstat -nr” as root and put the results up? It should look basically like this: (sorry for the slightly messed up formatting!)

    Destination Gateway Flags Refs Use Netif Expire
    127 127.0.0.1 UCS 0 0 lo0
    127.0.0.1 127.0.0.1 UH 3 123755 lo0
    169.254.137.72 127.0.0.1 UHS 0 0 lo0
    192.168.1.73 127.0.0.1 UHS 0 0 lo0

    ::1 ::1 UH lo0
    fd08:a43e:9507:e927:d69a:20ff:fe00:37e2 link#1 UHL lo0
    fe80::%lo0/64 fe80::1%lo0 Uc lo0
    fe80::1%lo0 link#1 UHL lo0
    fe80::d69a:20ff:fe72:b63e%en1 d4:9a:20:72:b6:3e UHL lo0
    fe80::21c:42ff:fe00:8%en2 0:1c:42:0:0:8 UHL lo0
    ff01::/32 ::1 Um lo0
    ff02::/32 ::1 UmC lo0

    Keep us updated!

    - Oliver

    P.S. I remembered that I have seen this once before. It turned out to be that the machine did not have enough memory to bring the lo interface up on start up. I assume that you have plenty of memory in your box?

  11. Jon says:

    Hi Oliver,

    thanks for your help and your time.

    Unfortunately I had to solve this problem quickly.
    So I did it over the weekend with ‘brute force’. I did a reinstall and built a new OD database. Now everything is fine except the users lost their passwords. But that was bearable.

    I’m really happy about your help. Sorry but I had to hurry.

    btw: yes you were right. That machine had plenty of memory.

    Greetings Jon

  12. betty says:

    Hi, Oliver.
    Nice howto regarding kerberos.
    Where can I find similar stuff regarding the Password Server on a 10.6 server?
    When the RSA private key is corrupt in the password server, no replicas can be made successfully. Haven’t found a way to rebuild it without wiping the passwords.
    you got a clue on that?

  13. Thank you for your advice. It worked for us. But you wrote a command in the wrong order:

    slapconfig -kerberize diradmin -f

    correct:

    slapconfig -kerberize -f diradmin

    Thx again!

  14. Oliver Helm says:

    Good spot, thanks! I’ve corrected it now.
