Discussion:
[nuttx] Question about pthread_mutex_unlock()
eunb.song@samsung.com [nuttx]
2017-03-22 11:08:48 UTC
Permalink
Hi, Greg.
I compared pthread_mutex_unlock implementation between Nuttx and glibc.


And i found a difference when pthread_mutex_unlock() is called by process which does not call pthead_mutex_lock()


In case of, Nuttx, return EPERM;
int pthread_mutex_unlock(FAR pthread_mutex_t *mutex)

{
,,,
if (mutex->pid != (int)getpid()) {
/* No... return an error (default behavior is like PTHREAD_MUTEX_ERRORCHECK) */ serr("ERROR: Holder=%d returning EPERM\n", mutex->pid);
ret = EPERM;
}



in case of glibc, you can see http://git.savannah.gnu.org/cgit/hurd/libpthread.git/tree/sysdeps/generic/pt-mutex-unlock.c http://git.savannah.gnu.org/cgit/hurd/libpthread.git/tree/sysdeps/generic/pt-mutex-unlock.c


switch (attr->__mutex_type) { case PTHREAD_MUTEX_ERRORCHECK: case PTHREAD_MUTEX_RECURSIVE: if (mutex->__owner != _pthread_self ()) { __pthread_spin_unlock (&mutex->__lock); return EPERM; }-> only return EPERM for ERRORCHECK and RECURSIVE type.
So, in my point of view, below changes are needed. diff --git a/sched/pthread/pthread_mutexunlock.c b/sched/pthread/pthread_mutexunlock.c index 078d909..fa17a58 100644 --- a/sched/pthread/pthread_mutexunlock.c +++ b/sched/pthread/pthread_mutexunlock.c @@ -98,7 +98,8 @@ int pthread_mutex_unlock(FAR pthread_mutex_t *mutex) /* Does the calling thread own the semaphore? */ - if (mutex->pid != (int)getpid()) + if (((mutex->type == PTHREAD_MUTEX_RECURSIVE) || (mutex->type == PTHREAD_MUTEX_ERRORCHECK)) + && (mutex->pid != (int)getpid())) { /* No... return an error (default behavior is like PTHREAD_MUTEX_ERRORCHECK) */

I would appreciate if you say your opinion.
'Juha Niskanen (Haltian)' juha.niskanen@haltian.com [nuttx]
2017-03-22 12:06:34 UTC
Permalink
Hi,


I'm not Greg but here's my take anyway. I don't understand how you infer this kind of change is needed, as that conclusion does not follow from your premises. NuttX code is correct and easier to read than its counterpart in Glibc.


Your change has no utility: it just adds extra logic for hiding errors for buggy code that calls pthread_mutex_unlock() from wrong thread or for unlocked mutex but check the return value != 0. Certainly callers cannot _expect_ to get EPERM, as this is undefined behavior in POSIX, but our manifestation of undefined behavior here makes sense.


Best Regards,

Juha

________________________________
From: ***@yahoogroups.com <***@yahoogroups.com> on behalf of ***@samsung.com [nuttx] <***@yahoogroups.com>
Sent: Wednesday, March 22, 2017 1:08:48 PM
To: ***@yahoogroups.com
Subject: [nuttx] Question about pthread_mutex_unlock()




Hi, Greg.

I compared pthread_mutex_unlock implementation between Nuttx and glibc.


And i found a difference when pthread_mutex_unlock() is called by process which does not call pthead_mutex_lock()


In case of, Nuttx, return EPERM;

int pthread_mutex_unlock(FAR pthread_mutex_t *mutex)

{

,,,
if (mutex->pid != (int)getpid()) {
/* No... return an error (default behavior is like PTHREAD_MUTEX_ERRORCHECK) */ serr("ERROR: Holder=%d returning EPERM\n", mutex->pid);
ret = EPERM;
}



in case of glibc, you can see http://git.savannah.gnu.org/cgit/hurd/libpthread.git/tree/sysdeps/generic/pt-mutex-unlock.c


switch (attr->__mutex_type)
{
case PTHREAD_MUTEX_ERRORCHECK:
case PTHREAD_MUTEX_RECURSIVE:
if (mutex->__owner != _pthread_self ())
{
__pthread_spin_unlock (&mutex->__lock);
return EPERM;
}-> only return EPERM for ERRORCHECK and RECURSIVE type.
So, in my point of view, below changes are needed. diff --git a/sched/pthread/pthread_mutexunlock.c b/sched/pthread/pthread_mutexunlock.c
index 078d909..fa17a58 100644
--- a/sched/pthread/pthread_mutexunlock.c
+++ b/sched/pthread/pthread_mutexunlock.c
@@ -98,7 +98,8 @@ int pthread_mutex_unlock(FAR pthread_mutex_t *mutex)

/* Does the calling thread own the semaphore? */

- if (mutex->pid != (int)getpid())
+ if (((mutex->type == PTHREAD_MUTEX_RECURSIVE) || (mutex->type == PTHREAD_MUTEX_ERRORCHECK))
+ && (mutex->pid != (int)getpid()))
{
/* No... return an error (default behavior is like PTHREAD_MUTEX_ERRORCHECK) */


I would appreciate if you say your opinion.
eunb.song@samsung.com [nuttx]
2017-03-22 12:40:22 UTC
Permalink
Hi, Juha.


Thanks for your reply.
Actually, i ported an open source and i suffered from because of different return value of ptherad_mutex_unlock in situation i mentioned before.
That's the why i raised this question.
I think the best thing for nuttx is that it's really easy to port open source to nuttx.
So i think it's very important to keep consistency between nuttx and linux or glibc if possible.
spudarnia@yahoo.com [nuttx]
2017-03-22 12:46:31 UTC
Permalink
Actually, there is a functional difference in this, not just an error reporting difference.


The functional difference is that this change permits the mutex to be released by a different thread if it is not a recursive mutex. That is new behavior and, I think probably correct.


If the mutex is not recursive, but cannot be released by another thread, then then setting the type to PTHREAD_MUTEX_ERRORCHECK will prohibit that.


That is a pretty fine interpretation of the standard, but is consistent with the way that GLIBC does things. So I am inclined to agree and to incorporate the change.


Greg
spudarnia@yahoo.com [nuttx]
2017-03-22 13:05:15 UTC
Permalink
So I think that the real question is,"Is this change in behavior correct?" Just because GLIBC does it, that does not mean that it is correct. The behavior of pthread_mutex_lock() should be governed by POSIX not be what other systems do.

Looking at

http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html

It is hard to tweeze out what the real requirement. But this page has more information:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html

And I would interpret that to say that the functional change is correct.

I see that the requiremet if the type is not ERRORCHECK is to deadlock if pthread_mutex is called twice. I don't like that, but that is the standard.
spudarnia@yahoo.com [nuttx]
2017-03-22 13:26:40 UTC
Permalink
Post by ***@yahoo.com [nuttx]
And I would interpret that to say that the functional change is correct.
No, I am wrong. According to http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html , pthread_mutex_unlock() should ALWAYS return an error if you attempt to unlock a mutex that you do not hold. That means that the current code is correct because it does just that.

The only violation of the standard that I see is in GLIBC for pthread_mutex_unlock() and in NuttX for pthread_mutex_lock(). The standard requires that a normal mutex should deadlock if you attempt to relock it. If you want the error, then you would have to change the type to ERRORCHECK.

The spec requires that change, but it scares me. I can't imagine why it would be a good think to permit an irrecoverable deadlock and there could be many applications that might break. I am inclined to just add a comment to the code and leave it non-compliant.

Hmmm... that comment was already there but could be beefed up:

/* No, then we would deadlock... return an error (default behavior
* is like PTHREAD_MUTEX_ERRORCHECK)
*/

Greg
'Juha Niskanen (Haltian)' juha.niskanen@haltian.com [nuttx]
2017-03-22 13:35:58 UTC
Permalink
"No, I am wrong. According to http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html , pthread_mutex_unlock() should ALWAYS return an error if you attempt to unlock a mutex that you do not hold."


No, I don't think you were wrong [😊] This is the relevant sentence: "If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, pthread_mutex_unlock() shall behave as described in the Unlock When Not Owner column of the following table" The you can implemented the "undefined" cases of that table any way you want, either Glibc way or current NuttX way, both are conforming implementations.


Best Regards,

Juha

________________________________
From: ***@yahoogroups.com <***@yahoogroups.com> on behalf of ***@yahoo.com [nuttx] <***@yahoogroups.com>
Sent: Wednesday, March 22, 2017 3:26:40 PM
To: ***@yahoogroups.com
Subject: [nuttx] Re: Question about pthread_mutex_unlock()
Post by ***@yahoo.com [nuttx]
And I would interpret that to say that the functional change is correct.
No, I am wrong. According to http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html , pthread_mutex_unlock() should ALWAYS return an error if you attempt to unlock a mutex that you do not hold. That means that the current code is correct because it does just that.

The only violation of the standard that I see is in GLIBC for pthread_mutex_unlock() and in NuttX for pthread_mutex_lock(). The standard requires that a normal mutex should deadlock if you attempt to relock it. If you want the error, then you would have to change the type to ERRORCHECK.

The spec requires that change, but it scares me. I can't imagine why it would be a good think to permit an irrecoverable deadlock and there could be many applications that might break. I am inclined to just add a comment to the code and leave it non-compliant.

Hmmm... that comment was already there but could be beefed up:

/* No, then we would deadlock... return an error (default behavior
* is like PTHREAD_MUTEX_ERRORCHECK)
*/

Greg
spudarnia@yahoo.com [nuttx]
2017-03-22 13:54:08 UTC
Permalink
Post by 'Juha Niskanen (Haltian)' ***@haltian.com [nuttx]
"No, I am wrong. According to http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html , pthread_mutex_unlock() should ALWAYS return an error if you attempt to unlock a mutex that you do not hold."
No, I don't think you were wrong This is the relevant sentence: "If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, pthread_mutex_unlock() shall behave as described in the Unlock When Not Owner column of the following table" The you can implemented the "undefined" cases of that table any way you want, either Glibc way or current NuttX way, both are conforming implementations.
In NuttX, the DEFAULT is the same as NORMAL. So yes, you are right for the case of the non-robust NORMAL mutex, the behavior if you attempt to unlock a mutex that you do not hold is undefined. The robust NORMAL mutex will return an error.

That is even clearer in the statements below:

[EPERM]
The mutex type is PTHREAD_MUTEX_ERRORCHECK or PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the current thread does not own the mutex.

Based on that, I would reverse myself again and tend to agree with the change. But I guess then the question is what would we willing to accept for the undefined behavior for the case of the non-robust mutex?

"Undefined behavior" is a code for "never to do that because you don't know what will happen." Asserting or faulting would be a valid undefined behavior. Code that depends on how a particular system implements undefined behavior is non-portable and you get what you get.

Greg
spudarnia@yahoo.com [nuttx]
2017-03-22 14:00:37 UTC
Permalink
Post by ***@yahoo.com [nuttx]
But I guess then the question is what would we willing to accept for the undefined behavior for the case of the non-robust mutex?
With the proposed change, the "undefined behavior" would be to permit one thread to unlock a mutex held by a different thread. I don't see any justification for that kind of behavior, defined or undefined.
'Juha Niskanen (Haltian)' juha.niskanen@haltian.com [nuttx]
2017-03-22 13:28:48 UTC
Permalink
Hi Greg,


Comment in line 64-65 of nuttx/sched/pthread/pthread_mutexlock.c says: "If the mutex type is PTHREAD_MUTEX_NORMAL, deadlock detection is not provided. Attempting to relock the mutex causes deadlock." but the code returns EDEADLK in this case, contradicting the comment. I concur that it should actually deadlock per OpenGroup specification. I don't like it either, but at least the comment should be edited to match actual behavior.


Back to unlock from wrong thread case: the open source program that tries to do this is buggy. All the world is not glibc. The buggy program assumes certain behavior for undefined behavior. It will bomb again when ported to some other platform next time. Mutexes have owners for a reason, one has to use semaphores if want to have ownerless synchronization primitive. Suppose threads A and B access critical section protected by mutex. Some other thread C unblocks mutex when A or B is busy working with it. Can it be even called a critical section, if its protection can vanish at arbitrary point?


Best Regards,

Juha




________________________________
From: ***@yahoogroups.com <***@yahoogroups.com> on behalf of ***@yahoo.com [nuttx] <***@yahoogroups.com>
Sent: Wednesday, March 22, 2017 3:05:15 PM
To: ***@yahoogroups.com
Subject: Re: [nuttx] Question about pthread_mutex_unlock()



So I think that the real question is,"Is this change in behavior correct?" Just because GLIBC does it, that does not mean that it is correct. The behavior of pthread_mutex_lock() should be governed by POSIX not be what other systems do.

Looking at

http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_mutex_lock.html

It is hard to tweeze out what the real requirement. But this page has more information:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html

And I would interpret that to say that the functional change is correct.

I see that the requiremet if the type is not ERRORCHECK is to deadlock if pthread_mutex is called twice. I don't like that, but that is the standard.
spudarnia@yahoo.com [nuttx]
2017-03-27 16:02:34 UTC
Permalink
Hi, eunb.song,

After thinking about this for a long time. I thought about your comments and about the things I learned about pthread mutexes, especially this page http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html

I have just finished a major change to the pthread mutex logic. I think it now meets both your needs and all of the requirements of the POSIX standard. There are some new configuration options. I list them below:

choice
prompt "pthread mutex robustness"
default PTHREAD_MUTEX_ROBUST if !DEFAULT_SMALL
default PTHREAD_UNSAFE if DEFAULT_SMALL

config PTHREAD_MUTEX_ROBUST
bool "Robust mutexes"
---help---
Support only the robust form of the NORMAL mutex.

config PTHREAD_MUTEX_UNSAFE
bool "Traditional unsafe mutexes"
---help---
Support only the traditional non-robust form of the NORMAL mutex.
You should select this option only for backward compatibility with
software you may be porting or, perhaps, if you are trying to minimize
footprint.

config PTHREAD_MUTEX_BOTH
bool "Both robust and unsafe mutexes"
---help---
Support both forms of NORMAL mutexes.

endchoice # pthread mutex robustness

choice
prompt "Default NORMAL mutex robustness"
default PTHREAD_MUTEX_DEFAULT_ROBUST
depends on PTHREAD_MUTEX_BOTH

config PTHREAD_MUTEX_DEFAULT_ROBUST
bool "Robust default"
---help---
The default is robust NORMAL mutexes (non-standard)

config PTHREAD_MUTEX_DEFAULT_UNSAFE
bool "Unsafe default"
---help---
The default is traditional unsafe NORMAL mutexes (standard)

endchoice # Default NORMAL mutex robustness

So with these options, you support pthread mutexes which are the traditional, unsafe mutexes that you want, modern robust mutexes, or either. This includes some new standard interface functions:

pthread_mutex_consistent()
pthread_mutexattr_getrobust()
pthread_mutexattr_setrobust()

So if you select CONFIG_PTHREAD_MUTEX_UNSAFE=y you should get the behavior that you want.

If you select CONFIG_PTHREAD_MUTEX_BOTH=y, you can also get that behavior if you also call pthread_mutex_setrobust(mutex, PTHREAD_MUTEX_STALLED) in order to enable the unsafe behavior.

Greg
Jussi Kivilinna jussi.kivilinna@haltian.com [nuttx]
2017-03-28 13:53:26 UTC
Permalink
Hello,

It seems that PTHREAD_MUTEX_INITIALIZER was not updated to match the changes to pthread_mutex_t. Here's patch to do just that.

-Jussi
Post by ***@yahoo.com [nuttx]
Hi, eunb.song,
After thinking about this for a long time. I thought about your comments and about the things I learned about pthread mutexes, especially this page http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html
choice
prompt "pthread mutex robustness"
default PTHREAD_MUTEX_ROBUST if !DEFAULT_SMALL
default PTHREAD_UNSAFE if DEFAULT_SMALL
config PTHREAD_MUTEX_ROBUST
bool "Robust mutexes"
---help---
Support only the robust form of the NORMAL mutex.
config PTHREAD_MUTEX_UNSAFE
bool "Traditional unsafe mutexes"
---help---
Support only the traditional non-robust form of the NORMAL mutex.
You should select this option only for backward compatibility with
software you may be porting or, perhaps, if you are trying to minimize
footprint.
config PTHREAD_MUTEX_BOTH
bool "Both robust and unsafe mutexes"
---help---
Support both forms of NORMAL mutexes.
endchoice # pthread mutex robustness
choice
prompt "Default NORMAL mutex robustness"
default PTHREAD_MUTEX_DEFAULT_ROBUST
depends on PTHREAD_MUTEX_BOTH
config PTHREAD_MUTEX_DEFAULT_ROBUST
bool "Robust default"
---help---
The default is robust NORMAL mutexes (non-standard)
config PTHREAD_MUTEX_DEFAULT_UNSAFE
bool "Unsafe default"
---help---
The default is traditional unsafe NORMAL mutexes (standard)
endchoice # Default NORMAL mutex robustness
pthread_mutex_consistent()
pthread_mutexattr_getrobust()
pthread_mutexattr_setrobust()
So if you select CONFIG_PTHREAD_MUTEX_UNSAFE=y you should get the behavior that you want.
If you select CONFIG_PTHREAD_MUTEX_BOTH=y, you can also get that behavior if you also call pthread_mutex_setrobust(mutex, PTHREAD_MUTEX_STALLED) in order to enable the unsafe behavior.
Greg
--
Jussi Kivilinna
Chief Architect, Embedded SW
Haltian Ltd
http://www.haltian.com/
spudarnia@yahoo.com [nuttx]
2017-03-28 15:08:09 UTC
Permalink
Yes you are right. Thanks. I missed that.


Greg
eunb.song@samsung.com [nuttx]
2017-03-28 23:34:44 UTC
Permalink
Hi. Greg. Thanks for your patch and nice feedback on my opinion.
I will check and test your new patch regarding pthread's mutex.


Thanks again.
Jussi Kivilinna jussi.kivilinna@haltian.com [nuttx]
2017-03-29 11:00:45 UTC
Permalink
Hello,

I found more mutex breakage. Following bit of code causes assert when pthread_mutex_unlock gets called after trylock:

pthread_mutex_t mutex;
pthread_mutex_init(&mutex, NULL);
if (pthread_mutex_trylock(&mutex) == 0)
pthread_mutex_unlock(&mutex);

[ 45.150000]up_assert: Assertion failed at file:pthread/pthread_mutex.c line: 159

-Jussi
Post by ***@yahoo.com [nuttx]
Yes you are right. Thanks. I missed that.
Greg
--
Jussi Kivilinna
Chief Architect, Embedded SW
Haltian Ltd
http://www.haltian.com/
spudarnia@yahoo.com [nuttx]
2017-03-29 13:56:37 UTC
Permalink
Good find, thank you. I have committed a fix. The description in the commit says it all:

commit eb344d7260731d850b64a201ed085496c4dce0cf
Author: Gregory Nutt <***@nuttx.org>
Date: Wed Mar 29 07:50:40 2017 -0600

Fix an assertion noted by Jussi Kivilinna.

This was a consequence of the recent robust mutex changes. If robust mutexes are selected, then each mutex that a thread takes is retained in a list in threads TCB. If the thread exits and that list is not empty, then we know that the thread exitted while holding mutexes. And, in that case, each will be marked as inconsistent and the any waiter for the thread is awakened.

For the case of pthread_mutex_trywait(), the mutex was not being added to the list! while not usually a fatal error, this was caught by an assertion when pthread_mutex_unlock() was called: It tried to remove the mutex from the TCB list and it was not there when, of course, it shoule be.

The fix was to add pthread_mutex_trytake() which does sem_trywait() and if successful, does correctly add the mutext to the TCB list. This should eliminated the assertion.
Jussi Kivilinna jussi.kivilinna@haltian.com [nuttx]
2017-03-30 11:26:52 UTC
Permalink
Thanks for the fix.

-Jussi
Post by ***@yahoo.com [nuttx]
commit eb344d7260731d850b64a201ed085496c4dce0cf
Date: Wed Mar 29 07:50:40 2017 -0600
Fix an assertion noted by Jussi Kivilinna.
This was a consequence of the recent robust mutex changes. If robust mutexes are selected, then each mutex that a thread takes is retained in a list in threads TCB. If the thread exits and that list is not empty, then we know that the thread exitted while holding mutexes. And, in that case, each will be marked as inconsistent and the any waiter for the thread is awakened.
For the case of pthread_mutex_trywait(), the mutex was not being added to the list! while not usually a fatal error, this was caught by an assertion when pthread_mutex_unlock() was called: It tried to remove the mutex from the TCB list and it was not there when, of course, it shoule be.
The fix was to add pthread_mutex_trytake() which does sem_trywait() and if successful, does correctly add the mutext to the TCB list. This should eliminated the assertion.
Loading...