Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When converting json to text it is converting characters weirdly #2624

Open
Godrules500 opened this issue Mar 21, 2022 · 12 comments
Open

When converting json to text it is converting characters weirdly #2624

Godrules500 opened this issue Mar 21, 2022 · 12 comments

Comments

@Godrules500
Copy link

When converting json data to a text or csv it is converting characters like ’ to ’

I see that changing it to use UTF will supposedly fix it according to other documents, but I only see the use case when reading the file and not writing/creating the document.

In your documentation it says that 'txt' is UTF-16, but "string" is UTF8. How would I go about changing that, or can I?

        var workSheet = XLSX.utils.json_to_sheet(results);
        var workBook = XLSX.utils.book_new();

        XLSX.utils.book_append_sheet(workBook, workSheet);
        var content = XLSX.write(workBook, { bookType: 'txt', type: 'string' });
@SheetJSDev
Copy link
Contributor

XLSX.write(workBook, {bookType: "csv", type: "string"});   // CSV as a JS string
XLSX.utils.sheet_to_csv(workSheet);                        // CSV from a worksheet (JS string)

binary will encode as binary strings.

When using writeFile, CSV will generate the binary output and add the UTF8 BOM. TXT aligns with "UTF-16 Text" in Excel. These choices optimize for the common case of generating files that will be read by Excel.

@Godrules500
Copy link
Author

Ok, so I've tried sheet_to_txt and write{bookType:'txt'}) and both are rendering the same way. Now I will say, that when I filter and convert one line, it is rendering the ’ character correctly. HOWEVER, when I return all 10,000 lines of data, it is then returning ’. Is there a reason why a small subset of data renders correctly, but the full set of data replaces certain characters?

@SheetJSDev
Copy link
Contributor

Does the same thing happen with CSV (using sheet_to_csv or write({bookType: "csv", type: "string"})?

In either case, can you share a sample? JSON.stringify(results) should be sufficient to test. If you can't share it publicly, email [email protected]

@Godrules500
Copy link
Author

Godrules500 commented Mar 21, 2022

test.txt

Here is the file. So weirdly I have an array of 10,000 lines. In trying to help get a smaller set of data I found that if I include 978-1232 it didn't work. If I took 979-5000 it worked.

So this file contains lines 978-1232. So in translation, the file on here is 1-255. So if I include line 1 in the sheet_to_txt it messes up the values. If I remove the first line, it works correctly.

Code:

var workSheet = XLSX.utils.json_to_sheet(results);
results = XLSX.utils.sheet_to_txt(workSheet);
return results;

@SheetJSDev
Copy link
Contributor

Please test the following:

results = XLSX.utils.sheet_to_txt(workSheet, {type: "string"});

If this works, the issue can be resolved with a small patch to https://github.com/SheetJS/sheetjs/blob/master/bits/90_utils.js#L153 (feel free to submit a PR):

	if(typeof $cptable == 'undefined' || opts.type == 'string' || !opts.type) return s;

@SheetJSDev SheetJSDev reopened this Mar 21, 2022
@Godrules500
Copy link
Author

Godrules500 commented Mar 21, 2022

Sadly it did not work and returned the same results :. Now I have pulled out the file since I'm working in a cloud system called Netsuite, so all I'm importing a file from either node_modules/xlsx/xlsx.js or node_modules/xlsx/dist/xlsx.extendedscript.js.

@Godrules500
Copy link
Author

Is there a way to read the content of json_to_sheet? That way I could determine if the file is being incorrectly manipulated there?

@SheetJSDev
Copy link
Contributor

The result is a plain JS object, you can directly inspect it. To get a specific cell, you can index with an Excel address:

var A1 = workSheet["A1"];

Please try using node_modules/xlsx/dist/xlsx.full.min.js or node_modules/xlsx/dist/xlsx.core.min.js (and be sure to pull the latest version from npm!). node_modules/xlsx/xlsx.js is designed for use in NodeJS.

Some small test https://jsfiddle.net/sheetjs/vmez8u1g/ shows the effect of the parameters. FF FE 19 20 is a binary encoded version with the BOM while 2019 corresponds to "\u2019" ()

@Godrules500
Copy link
Author

Continuing with the weird, in sheet_to_csv I added this return out.slice(2, 265).join("") and it works, but with slice(0 or 1, 265) it didn't work.

Ok, I'll try that and see. When I tried importing it in requirejs it was causing some issues and I couldn't run it.

@Godrules500
Copy link
Author

For whatever reason I cannot get it to load the xlsx.min.js nor xlsx.core.min.js to load using requirejs inside of Netsuite.

@SheetJSDev
Copy link
Contributor

NetSuite support was verified working in 0.16.1 and there was no change to the RequireJS logic since then. Is there some reported error?

@Godrules500
Copy link
Author

Godrules500 commented Mar 22, 2022

Ok, I have the xlsx.full.min.js working and I'm still getting the same issue. So I'm not sure if the code is converting it to a different character or if it is the javascript version causing the issue. since the issue seems to be when converting out to a string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants