Show HN: sheet2dict – simple Python XLSX/CSV reader/to dictionary converter

59 points by pytlicek about 4 years ago

10 comments

pytlicek about 4 years ago
I created this tool for myself because I often work with XLSX and CSV files. This is usually done with pandas. But if you just need to read these files and work with the values in each row, you don't need to import the whole pandas library, or even install it, which makes deployment easier. This can save you up to 2GB of space if you build Docker images with pandas.

One thing I still don't understand is services like snyk.io, which are supposed to do security analysis. They penalize a tool like this for not having a CoC or a Contributing file in the GitHub repository, and, what is most shocking to me, they measure popularity. I understand that if more people are involved in a piece of software, it is probably safer. But penalizing someone for having few stars on GitHub seems weird to me, especially when the tool is used by several people / companies and has over 5,000 downloads.
BugsJustFindMe about 4 years ago
You won't be able to use this if your file doesn't fit in RAM. This unnecessarily clones the file into a list instead of returning a generator and leaving the list conversion up to the user.
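
To illustrate the generator approach described in this comment, here is a minimal sketch that uses only the standard library. It is not sheet2dict's code; the function name `iter_rows` is hypothetical, and it assumes a UTF-8 CSV file whose first line is the header.

```python
import csv
from typing import Dict, Iterator


def iter_rows(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, reading the file lazily.

    Hypothetical helper for illustration; assumes the first line is a header.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        # csv.DictReader pulls lines from the file only as they are requested,
        # so at most one row is held in memory at a time.
        for row in csv.DictReader(handle):
            yield row
```

A caller who does want everything in memory can still write `list(iter_rows(path))`, which keeps that choice on the user's side.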
lettergram about 4 years ago
I maintain a similar project: load any CSV, manipulate it, get stats, detect sensitive data, etc.

https://github.com/capitalone/DataProfiler

My question: how do you do header detection? That's a _very_ difficult problem.
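
The thread does not say how either sheet2dict or DataProfiler detects headers. As one point of reference only, Python's standard library ships a rough heuristic, `csv.Sniffer().has_header`, which compares the first row against the rows below it. The sketch below wraps it in a hypothetical `looks_like_header` function, and the sample data is made up.

```python
import csv

# Made-up sample data for illustration only.
SAMPLE = (
    "name,qty,price\n"
    "apple,3,1.25\n"
    "banana,12,0.5\n"
    "cherry,7,10.0\n"
)


def looks_like_header(sample: str) -> bool:
    """Return the standard library's guess about whether row one is a header.

    This is a heuristic: it can be wrong, especially when every column holds
    free-form text, which is part of why the problem is hard.
    """
    return csv.Sniffer().has_header(sample)


print(looks_like_header(SAMPLE))
```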
gpapilion about 4 years ago
Isn’t this already in the csv module with DictReader?

XLSX I know nothing about.
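
For reference, a short sketch of what `csv.DictReader` from the standard library does for the CSV case. The sample text and the `read_csv_dicts` name are made up for illustration; the XLSX case is not covered by the `csv` module.

```python
import csv
import io
from typing import Dict, List

# Made-up sample data for illustration only.
SAMPLE = "name,qty\napple,3\nbanana,12\n"


def read_csv_dicts(text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of dicts keyed by the header row."""
    # DictReader takes the first row as field names by default.
    return list(csv.DictReader(io.StringIO(text)))


rows = read_csv_dicts(SAMPLE)
for row in rows:
    # All values come back as strings; any type conversion is up to the caller.
    print(row["name"], int(row["qty"]))
```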
psing about 4 years ago
There are many data engineers at companies who have to write custom little scripts to take data from spreadsheets into an analytics DB.

Thanks for removing some boilerplate from that process for people!
athorax about 4 years ago
For the opposite direction, I have had good luck with the XlsxWriter library:

https://github.com/jmcnamara/XlsxWriter
jquaint about 4 years ago
Speaking of Excel files, does anyone know of a good way to port sheets with equations/functions to Python? Sometimes I need the calculations from a sheet and I have to copy them over manually.
unixhero about 4 years ago
I usually import CSVs into a pandas DataFrame and then iterate over the DataFrame in a loop, or make manual line-by-line interventions, and then beam the data out somewhere else...

Is this a better approach?
stuaxo about 4 years ago
Nice. I did something like this for XLS and made it a gist ages ago; it's good you're putting up something that's more maintained and works with multiple formats.
impoppy about 4 years ago
It’d be better to use namedtuple to avoid repeating the same dictionary keys, imo.
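
A minimal sketch of the namedtuple idea from this comment, using only the standard library. It is not part of sheet2dict; the `read_csv_namedtuples` name and the sample data are made up, and the sketch assumes every data row has the same number of fields as the header (otherwise `_make` raises `TypeError`).

```python
import csv
import io
from collections import namedtuple

# Made-up sample data for illustration only.
SAMPLE = "name,qty\napple,3\nbanana,12\n"


def read_csv_namedtuples(text: str) -> list:
    """Parse CSV text into namedtuples whose fields come from the header row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # rename=True swaps invalid or duplicate column names for positional
    # names such as _0, so an awkward header does not raise ValueError.
    Row = namedtuple("Row", header, rename=True)
    # The field names are stored once on the Row class, not in every row.
    return [Row._make(fields) for fields in reader]


records = read_csv_namedtuples(SAMPLE)
print(records[0].name, records[0].qty)
```

Each row then stores only its values, with attribute access (`row.name`) replacing key lookup (`row["name"]`).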