The abominable strtok()

18 points by whackberry, over 14 years ago

6 comments

tptacek, over 14 years ago
I've heard lots of dev teams gripe about strsep(3) because it "isn't standard" or "isn't cross-platform". Preposterous! strsep is a ~20 line ANSI C function; it will compile and run flawlessly on any platform that doesn't natively provide it.

I agree strongly with the other comments on this thread that strsep(3) is just peachy, even though it "alters its inputs". Unlike in functional programs, reasonable destructive functions are to be *preferred* in C programs; it's easier to work around destructiveness than it is to track state (and/or to go through contortions to pretend that you aren't tracking state).
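
To make the "~20 lines" claim concrete, here is a minimal sketch of what a drop-in strsep might look like in plain ANSI C. The name my_strsep is made up for illustration (so it won't clash with a native strsep), and this is not any particular libc's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name, chosen to avoid clashing with a native strsep(). */
char *my_strsep(char **stringp, const char *delim)
{
    char *start = *stringp;
    char *end;

    if (start == NULL)
        return NULL;                      /* nothing left to tokenize */

    /* Find the first delimiter, or the terminating '\0' if there is none. */
    end = start + strcspn(start, delim);
    if (*end != '\0') {
        *end = '\0';                      /* destructive: overwrite the delimiter */
        *stringp = end + 1;               /* next call resumes just past it */
    } else {
        *stringp = NULL;                  /* last token: signal end of input */
    }
    return start;
}
```

Unlike strtok, this keeps all of its state in the caller's pointer, and adjacent delimiters yield empty tokens instead of being silently merged.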

__david__, over 14 years ago
First off, strtok() is dead, long live strsep().

Secondly, this is C code--it's pretty low-level. Yes, strsep() messes with your input. But that is very well documented. If you don't want it to, strdup() beforehand (or strndup() if you don't trust the input).

His whole example of strtok() dying on a character constant is stupid--why on earth would you do that? If you've got a string constant you may as well just have the constant array and save yourself the parsing headache.

This isn't rocket science.
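
A rough sketch of the "strdup() beforehand" pattern, assuming a platform that provides strdup() and strsep() (glibc or a BSD libc; the feature macro below is a glibc assumption). The helper name count_fields is invented for the example.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes strdup() and strsep() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts delimiter-separated fields without
 * modifying `input`, so it is safe to call on a string constant. */
int count_fields(const char *input, const char *delim)
{
    char *copy = strdup(input);   /* private, writable copy */
    char *cursor = copy;          /* strsep() advances this, not `copy` */
    int n = 0;

    if (copy == NULL)
        return -1;                /* allocation failed */

    while (strsep(&cursor, delim) != NULL)
        n++;

    free(copy);                   /* free the original pointer, not the cursor */
    return n;
}
```

Keeping a separate cursor matters: strsep() leaves it NULL at the end, so the pointer returned by strdup() is the one that has to be freed.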

jsolson, over 14 years ago
I began to have serious concerns about this article when I saw the author allocate and then immediately leak memory in the first example.

I agree with the message: don't use strtok, it has unpleasant side effects that you probably don't want. I do not feel this article does a good job of presenting that message. It spends too much time on pathologically bad examples of strtok usage while only briefly mentioning (and providing no example code for) any of the alternatives.

ComputerGuru, over 14 years ago
I think anyone that's done even a little bit of C work on any platform is aware of this issue.... but it's always worth griping over it some more. I guess the poster just ran into it and couldn't help but express his frustration :)

For Win32 developers who don't have the glib g_strsplit function, you can use strtok_s which is detailed on MSDN here: http://msdn.microsoft.com/en-us/library/ftsafwz3(VS.80).aspx

strtok_s is re-entrant, thread-safe, and uses no global data, but keep in mind it still modifies your input string.
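
As a sketch of what the re-entrancy buys you: with a caller-supplied context pointer, two tokenizations can be nested, which plain strtok cannot do. The example uses POSIX strtok_r and, as an assumption, maps it to MSVC's three-argument strtok_s on Windows; count_words is a hypothetical helper.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes strtok_r() */
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
#define strtok_r strtok_s   /* MSVC's strtok_s takes the same three arguments */
#endif

/* Hypothetical helper: counts space-separated words across
 * ';'-separated records. Note that it still modifies `text`. */
int count_words(char *text)
{
    char *rec_ctx;    /* state for the outer (record) tokenizer */
    char *word_ctx;   /* independent state for the inner (word) tokenizer */
    char *rec, *word;
    int n = 0;

    for (rec = strtok_r(text, ";", &rec_ctx); rec != NULL;
         rec = strtok_r(NULL, ";", &rec_ctx)) {
        for (word = strtok_r(rec, " ", &word_ctx); word != NULL;
             word = strtok_r(NULL, " ", &word_ctx))
            n++;
    }
    return n;
}
```

With plain strtok, the inner loop would overwrite the single hidden state and the outer loop would stop after the first record.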

jswinghammer, over 14 years ago
The man page covers the problems with strtok pretty well. I found it useful enough, and while the code that uses it is pretty awkward, it works and gets the job done.

I think I have used it in shipping code two or three times. These days I don't do much in C so it's unlikely I'll ever use it again.

philwelch, over 14 years ago
strtok() altering the source string is a natural consequence of null termination. The traditional Pascalian equivalent (put the string length at the head of the string) creates a symmetric problem at the head of the string. My (perhaps naive) idea is to use a struct containing the string length and pointing to the beginning of the string data. Nondestructive tokenization becomes simple.

This post actually gave me an impetus to go back and add tokenization to my string library that's built around this idea.
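
One way the length-plus-pointer idea could look, as a minimal sketch. The type and function names (strview, sv_next_token) are invented for illustration and are not taken from the poster's library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a length plus a pointer into someone else's bytes.
 * The data need not be null-terminated. */
typedef struct {
    const char *data;
    size_t len;
} strview;

/* Splits the next token off the front of *rest and stores it in *tok.
 * Returns 1 if a token was produced, 0 once the input is exhausted.
 * The underlying bytes are never written to. */
int sv_next_token(strview *rest, char delim, strview *tok)
{
    const char *hit;

    if (rest->data == NULL)
        return 0;                              /* already exhausted */

    hit = memchr(rest->data, delim, rest->len);
    tok->data = rest->data;
    if (hit != NULL) {
        tok->len = (size_t)(hit - rest->data);
        rest->len -= tok->len + 1;             /* skip token and delimiter */
        rest->data = hit + 1;
    } else {
        tok->len = rest->len;                  /* final token */
        rest->data = NULL;
        rest->len = 0;
    }
    return 1;
}
```

Because a token is just a view into the original buffer, this works on string constants and needs no copy; the trade-off is that tokens are not null-terminated, so they have to be printed with something like printf("%.*s", (int)tok.len, tok.data).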