[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openbsd-tech
Subject:    fgetwln(3) fails to report most encoding errors
From:       Ingo Schwarze <schwarze () usta ! de>
Date:       2016-08-21 14:16:57
Message-ID: 20160821141657.GA938 () athene ! usta ! de
[Download RAW message or body]

Hi,

did i mention already that libc wide character code is buggy as hell?
I looked at another very simple function of only 30 lines of code
and promptly found another bug.

The fgetwln(3) manual is quite explicit that the "fgetwln() function
may also fail ... for any of the errors specified for ... mbrtowc(3)"
and that it must return NULL in case of failure.  That's sensible;
we shouldn't expect programmers to inspect ferror(3) or errno(2)
after getting a function return value indicating success.

However, after reading a single valid character, fgetwln(3) will
mistreat all subsequent encoding errors as newlines - returning
success when encountering an invalid encoding, but still setting
both errno(3) and the stdio error indicator.

OK to commit the following patch?

Note that it will make programs using fgetwln(3), in particular in
ports, error out more frequently, and no longer permit reading of
streams containing encoding errors with this function.  But trying
to do so wasn't reliable in the past anyway, because encoding errors
right after newlines already caused the function to error out.

Also note that FreeBSD and NetBSD contain the same bug.
Actually, i found the bug because i played with FreeBSD rev(1) code
on OpenBSD and was surprised by its absurd behaviour when fed input
containing encoding errors.

Yours,
  Ingo


Index: stdio/fgetwln.c
===================================================================
RCS file: /cvs/src/lib/libc/stdio/fgetwln.c,v
retrieving revision 1.1
diff -p -U11 -r1.1 fgetwln.c
--- stdio/fgetwln.c	12 Jan 2015 20:58:07 -0000	1.1
+++ stdio/fgetwln.c	21 Aug 2016 14:00:32 -0000
@@ -59,23 +59,23 @@ fgetwln(FILE * __restrict fp, size_t *le
 
 	len = 0;
 	while ((wc = __fgetwc_unlock(fp)) != WEOF) {
 #define	GROW	512
 		if (len >= fp->_lb._size / sizeof(wchar_t) &&
 		    __slbexpand(fp, len + GROW))
 			goto error;
 		*((wchar_t *)fp->_lb._base + len++) = wc;
 		if (wc == L'\n')
 			break;
 	}
-	if (len == 0)
+	if (len == 0 || fp->_flags & __SERR)
 		goto error;
 
 	FUNLOCKFILE(fp);
 	*lenp = len;
 	return (wchar_t *)fp->_lb._base;
 
 error:
 	FUNLOCKFILE(fp);
 	*lenp = 0;
 	return NULL;
 }

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic