nsswitch.conf and sources issues...

Hi everybody,

this post is similar to a previous one that I had started in the past and that hadn't been answered: http://forums.freebsd.org/showthread.php?t=2978

This time it's more generic. /etc/nsswitch.conf claims to support a 'database -> sources [-> criterion] [->action]' syntax. Entries are given in the following form:
Code:
group: cache  [success=return notfound=continue unavail=continue tryagain=continue] ldap [success=return notfound=return unavail=return tryagain=return]  files
passwd: cache [success=return notfound=continue unavail=continue tryagain=continue] ldap [success=return notfound=return unavail=return tryagain=return] files
shells: files
In this example (which is the setup that troubles me, by the way), in the first line the system is instructed to read groups (the group database) from the cache (nscd), if the entry is found there it should stop, otherwise it should continue searching in ldap, if the entry is found it should stop, otherwise it should continue the search in files (usually /etc/group). The same procedure is expected for the passwd database, but for the shells database the system should search only in files.

My problem is that no matter what criteria I use and what actions I perform based on those criteria, some queries are always asked on my ldap backend regardless of the fact that the search was successful on some previous resource. The same applies to files as well; this is not an ldap-centric question. The additional queries asked are related to the groups the user in question belongs to.

For example, if I write:
$ id mamalos
Code:
uid=1000(mamalos) gid=513(Domain Users) groups=513(Domain Users),0,814
I get that mamalos belongs to group 513 (whose name and gidNumber are stored in ldap), and to groups 0 and 814 that do not resolve to a group name. The latter groups are stored in my local /etc/group file. What's funny, is that even though nsswitch finds that mamalos belongs to these groups, it somehow refuses to resolve their names...even though the appropriate information is stored in the exact same file :)

Now, what troubles me is that I am not sure how to make the search stop in the resource I wish, and hence, how to ensure that nscd will be in charge for answering cached queries and how will I ensure that nsswitch will stop asking further resources for those queries. At the moment, using the aforementioned configuration, when I ask twice about mamalos' id, my ldap.log reads:
Code:
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 fd=23 ACCEPT from IP=192.168.50.29:27734 (IP=0.0.0.0:389)
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=0 BIND dn="" method=163
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=0 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=1 BIND dn="" method=163
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=1 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=2 BIND dn="" method=163
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=2 BIND authcid="mamalos@EXAMPLE.COM" authzid="mamalos@EXAMPLE.COM"
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=2 BIND dn="uid=mamalos,ou=people,dc=example,dc=com" mech=GSSAPI sasl_ssf=56 ssf=56
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=2 RESULT tag=97 err=0 text=
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=3 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup))"
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=3 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=3 SEARCH RESULT tag=101 err=0 nentries=21 text=
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=4 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup)(gidNumber=0))"
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=4 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=4 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=5 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup)(gidNumber=814))"
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=5 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 op=5 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 21 18:57:50 ldap slapd[33005]: conn=1383 fd=23 closed (connection lost)

Mar 21 18:57:53 ldap slapd[33005]: conn=1384 fd=23 ACCEPT from IP=192.168.50.29:53894 (IP=0.0.0.0:389)
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=0 BIND dn="" method=163
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=0 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=1 BIND dn="" method=163
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=1 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=2 BIND dn="" method=163
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=2 BIND authcid="mamalos@EXAMPLE.COM" authzid="mamalos@EXAMPLE.COM"
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=2 BIND dn="uid=mamalos,ou=people,dc=example,dc=com" mech=GSSAPI sasl_ssf=56 ssf=56
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=2 RESULT tag=97 err=0 text=
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=3 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup))"
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=3 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=3 SEARCH RESULT tag=101 err=0 nentries=21 text=
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=4 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup)(gidNumber=0))"
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=4 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=4 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=5 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 filter="(&(objectClass=posixGroup)(gidNumber=814))"
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=5 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 op=5 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 21 18:57:53 ldap slapd[33005]: conn=1384 fd=23 closed (connection lost)
which, to my understanding, shows that even though the system should have stopped asking information about mamalos (since the query should have been cached), it doesn't. This way I am not able to minimize ldap traffic by local caching of nsswitch information, which is what I desire.

nscd is running with default configuration, ldap, kerberos, sasl/gssapi support all work like a charm, nss_ldap does its job (thanks to your help guys!), and the only thing that doesn't work -at the moment- is nsswitch.

Thank you all for your time and help in advance,

mamalos
 
Uo,

unfortunately it does exactly the same thing. You can read a very short discussion on this issue, without cache, by following the link to the previous post.

As you can realize, the implementation we discussed on the other thread regarding sasl/gssapi performance and nscd, is not that straight forward after all...:(

If you know some other documentation regarding nsswitch.conf to address me to it would be helpful. I'll take a short look into the source code, but I don't think I'll understand much.

Thanks again,

mamalos
 
AAh,

and another thing I forgot to mention: When I run[cmd=]# nscd -nst[/cmd]

Code:
nscd -nst 
M1 from main: request agents registered successfully
M2 from cache: cache was successfully initialized
M2 from runtime environment: using socket /var/run/nscd
M2 from runtime environment: successfully initialized
M1 from main: working in single-threaded mode

to troubleshoot nscd, the output never changes. Nevertheless, from the ldap point of view, I see differences when nscd is enabled or disabled. The difference in my logs is as follows:

When nscd is disabled, my /var/log/ldap.log shows:
Code:
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 fd=18 ACCEPT from IP=192.168.100.11:55620 (IP=0.0.0.0:389)
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=0 BIND dn="" method=163
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=0 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=1 BIND dn="" method=163
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=1 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=2 BIND dn="" method=163
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=2 BIND authcid="nss_auth_client@EXAMPLE.COM" 
authzid="nss_auth_client@EXAMPLE.COM"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=2 BIND dn="uid=nss_auth_client,ou=special 
users,dc=example,dc=com" mech=GSSAPI sasl_ssf=56 ssf=56
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=2 RESULT tag=97 err=0 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=3 SRCH base="ou=People,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixAccount)(uid=xaris))"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=3 SRCH attr=uid userPassword uidNumber gidNumber cn 
homeDirectory loginShell gecos description objectClass shadowLastChange shadowMax shadowExpire loginClass
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=3 SEARCH RESULT tag=101 err=0 nentries=1 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=4 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixGroup))"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=4 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=4 SEARCH RESULT tag=101 err=0 nentries=12 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=5 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixGroup)(gidNumber=10000))"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=5 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=5 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=6 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixGroup)(gidNumber=10000))"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=6 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=6 SEARCH RESULT tag=101 err=0 nentries=0 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=7 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixGroup)(gidNumber=10001))"
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=7 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 op=7 SEARCH RESULT tag=101 err=0 nentries=1 text=
Mar 22 11:44:29 ldap slapd[33005]: conn=3090 fd=18 closed (connection lost)

When nscd is enabled, /var/log/ldap.log shows:
Code:
Mar 22 11:48:03 ldap slapd[33005]: conn=2974 fd=15 closed (idletimeout)
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 fd=15 ACCEPT from IP=192.168.100.11:64527 (IP=0.0.0.0:389)
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=0 BIND dn="" method=163
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=0 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=1 BIND dn="" method=163
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=1 RESULT tag=97 err=14 text=SASL(0): successful result:
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=2 BIND dn="" method=163
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=2 BIND authcid="nss_auth_client@EXAMPLE.COM" 
authzid="nss_auth_client@EXAMPLE.COM"
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=2 BIND dn="uid=nss_auth_client,ou=special 
users,dc=example,dc=com" mech=GSSAPI sasl_ssf=56 ssf=56
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=2 RESULT tag=97 err=0 text=
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=3 SRCH base="ou=Groups,dc=example,dc=com" scope=2 deref=0 
filter="(&(objectClass=posixGroup))"
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=3 SRCH attr=cn userPassword memberUid uniqueMember gidNumber
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 op=3 SEARCH RESULT tag=101 err=0 nentries=12 text=
Mar 22 11:48:03 ldap slapd[33005]: conn=3102 fd=15 closed (connection lost)

which, if I understand correctly, means that when nscd is enabled, nsswitch does not resolve group ids to names; it does search for membership relations nonetheless (it still tries to find out which groups my user belongs to, by looking in the ldap and files databases). On the other hand, when nscd is disabled, nsswitch performs the whole lookup each time.
 
Hmm,

I found two more problems with nsswitch, regarding su this time:

1) If I use SASL authentication in nss_ldap, then when I try to:

Code:
$ su -
Password:
#

I get the following error on /var/log/messages:

Code:
Mar 28 18:17:03 mamalacation su: GSSAPI Error:  Miscellaneous failure (see text) (open(/tmp/krb5cc_1001): No such file or directory^B)
Mar 28 18:17:03 mamalacation su: nss_ldap: could not search LDAP server - Server is unavailable

which is a bit peculiar, since 1001 is my default user (mamalos), and it seems that su tries to find a principal with a uid=1001 when it tries to access nsswitch information, instead of using the sasl_authid user from /usr/local/etc/nss_ldap.conf.

If I use binddn=blabla, then everything works just fine.

2) When my /etc/nsswitch.conf reads:
Code:
hosts: ldap files
group: ldap files

and I run:
Code:
$ id mamalos
uid=1001(mamalos) gid=513(Domain Users) groups=513(Domain Users),512(Domain Admins),0(wheel),814(puppet)

we can see that mamalos is a member of the wheel group. But when I try to su to root I get a "BAD SU mamalos to root" in /var/log/messages. When my /etc/nsswitch.conf reads:
Code:
hosts: files ldap
group: files ldap
and I run:
Code:
$ id mamalos
uid=1001(mamalos) gid=1001(mamalos) groups=1001(mamalos),0(wheel),814(puppet),512(Domain Admins)

then, suing to root works just fine. The same holds when I run id (with no arguments); in that case, the system replies with the information it finds in the first resource (ldap on the first example, file on the second) and then it stops.

That said, I come into two separate conclusions:
1) SASL authentication in nss_ldap doesn't seem to be fully functional.
2) It seems that some commands call functions that honor the criteria/actions fields of /etc/nsswitch.conf, while other commands call functions that don't (eg. id, getent).

Any comments on these issues are welcome,

mamalos

PS. I have tried to start a conversation on the stable@freebsd.org mailing list regarding the exact same problem (in case someone helps), and this post is a direct copy from my last message to the list. Potential answers will be submitted to this forum too.
 
Back
Top