Very strange SearchStr behavior...

Dear all,

I have a really strange problem and hope someone can point me in the right direction :-)

I have a TAB-delimited text file. When I search for the string I need (it's actually a whole sentence), every text editor finds it without problems :-)

But when I look for the same string with the SearchStr function, the result is 0!

How is that possible?

A couple of hints:

1. I don't create the text file myself; it's some kind of export/dump from an online shop database that arrives in my mail weekly.

2. Some fields are multi-line.

Any ideas, comments or tips?

Thanks in advance :-)

Anyone ?

Have a nice day!
Attach a sample file that shows the problem, so that it can be tested locally.

@vadim

Thank you for the quick response :-)

In the attachment you can find a small part of the file.

The string I can't find using the SearchStr function from NeoBook is:

Ultra Move – Die besondere Materialzusammensetzung sorgt für maximale Bewegungsfreiheit ohne dabei den klassischen Deinmlook zu verfälschen.

The file is TAB-delimited.

With best regards :-)

Your text file is UTF-8 encoded, so you need to recode it before searching in it. This can be done programmatically with a single command from the zmFunctions plugin (Peter Pavlov).
I am attaching the sample.
The plugin can be downloaded here: https://visualneo.com/forum/topic/loc-file-for-zmfunctions

Judging by the subject, you have it :)

The button in the demo project has the following code:

FileToVar "E:\test.txt" "[file]"
zmConvertString "Utf8ToAnsi" "[file]" "[TextAnsi]"
SearchStr "[TextEntry1]" "[TextAnsi]" "[pos]" ""
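
To show why the recoding step matters, here is a minimal sketch in Python (not NeoBook script) of the same three steps. It assumes that "ANSI" means Windows-1252 and that SearchStr returns a 1-based position with 0 for "not found"; the `search_str` helper and the sample text are only stand-ins for illustration.

```python
def search_str(needle: bytes, haystack: bytes) -> int:
    """Hypothetical stand-in for SearchStr: 1-based position, 0 if not found."""
    return haystack.find(needle) + 1

text = ("0000000Ultra Move – Die besondere Materialzusammensetzung "
        "sorgt für maximale Bewegungsfreiheit")

# What the exported file contains on disk (UTF-8 bytes).
utf8_bytes = text.encode("utf-8")

# What an ANSI-only application searches for (assumed Windows-1252).
needle_ansi = "für maximale".encode("cp1252")

# Searching the raw UTF-8 bytes fails: "ü" is two bytes there, one in ANSI.
pos_before = search_str(needle_ansi, utf8_bytes)

# Recode UTF-8 -> ANSI first (the role of zmConvertString "Utf8ToAnsi").
ansi_bytes = utf8_bytes.decode("utf-8").encode("cp1252")
pos_after = search_str(needle_ansi, ansi_bytes)

print(pos_before, pos_after)
```

With these assumptions the first search reports 0 and the second reports a position greater than 0.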

 


@vadim

Thank you very much for your response and effort, but the [pos] value is 0 :-( no matter what...

I added some characters BEFORE the string I'm searching for (just in case), but nothing changed: the [pos] value is still 0, not 7 (I added 7 zeroes before the string).

This is weird, isn't it?

Best regards :-)

@dglojnar

Assuming that, following suggestion from @vadim, the file is now Ansi-coded ... and your earlier posts ...

it's a some kind of export/dump from online shop database
Some fields are multi-line

... the issue might be with the way 'new line codes' are handled in the received file and by the multi-line TextEntry Box ... in text files created with Windows (e.g. by notepad.exe), new lines are indicated by two characters, carriage return ([#13]) and line feed ([#10]) ... on Unix/Linux systems there is just one character, line feed ([#10]) ... classic Mac OS also used a single character, but a different one: carriage return ([#13]).

So, e.g. if you are searching multiple lines using [#13][#10] ... and the ANSI file contains just [#13], there will be no match, i.e. [pos] = 0

I don't have the zmFunctions plugin installed on my machine (and do not desire to do so) ... assuming the plugin has a command to save the Ansi-coded text (in your variable) to a file, you can examine this file using notepad.exe ... if you see weird characters in place of new lines, it will confirm what I said about 'new line codes'.
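
If eyeballing the file in an editor is inconclusive, the 'new line codes' can also be counted directly. This is only a sketch in Python (not NeoBook script); the `count_line_endings` helper and the sample bytes are made up for illustration.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs and lone CR / lone LF bytes in raw file content."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,   # carriage returns not followed by LF
        "LF": data.count(b"\n") - crlf,   # line feeds not preceded by CR
    }

# Made-up sample mixing all three conventions.
sample = b"a\tb\r\nc\td\ne\rf\r\n"
counts = count_line_endings(sample)
print(counts)

# Normalising everything to a single convention before searching.
normalized = sample.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

For a real file, the bytes would come from `open(path, "rb").read()`; a non-zero "CR" or "LF" count would confirm mixed or non-Windows line endings.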

Otherwise, post the SearchStr command you are deploying for this search (and the content of any variables used in this command).

@dglojnar

Yes, indeed! There are unicode symbols that break when recoding.
I think we need a plugin that can search in Unicode files. Unfortunately, I don't know of one.

@gaev, @vadim

Thank you guys for the answers :-)

So, I checked the file with Notepad++ (as we all know, it can show all characters inside a text file, including CR, LF, etc.).

No matter what the encoding is, ANSI or UTF-8, new lines are always indicated with CR LF ([#13][#10]).

I tried 2 different approaches:

  1. Read the file line by line, search for CR LF, and from that position search for the string I need. BUT... the reported position of CR LF doesn't match reality: it comes back as the 25th or 26th character from the beginning. This is, of course, wrong; it has to be at least 255 or more...
  2. Parse each line using the actual delimiter TAB ([#9]) and then loop over the parsed elements to find the string I need. The result is again 0.

So, my conclusion is one of these:

1. My approach is wrong.

2. NeoBook is not capable of reading the text file "as is" and I have to find a different solution.

Here's the code for the parse approach:

FileLen "!C:\Feed\Products - Copy.txt" "[len]"

Loop "1" "[len]" "[pos]"
FileRead "!C:\Feed\Products - Copy.txt" "[pos]" "[data]"
StrParse "[data]" "[#9]" "[myLinesArray]" "[myLinesCount]"

Loop "1" "[myLinesCount]" "[thisLine]"
   SetVar "[myCurrentLine]" "[myLinesArray[thisLine]]"
   SearchStr "Ultra Move – Die besondere Materialzusammensetzung sorgt für maximale Bewegungsfreiheit ohne dabei den klassischen Deinmlook zu verfälschen.[#34]" "[myCurrentLine]" "[spos]" ""
SearchStr "[#13]" "[myCurrentLine]" "[next_pos]" ""
EndLoop

And first approach:

FileLen "!C:\Feed\Products - Copy.txt" "[len]"

Loop "1" "[len]" "[pos]"
FileRead "!C:\Feed\Products - Copy.txt" "[pos]" "[data]"
SearchStr "[#13]" "[myCurrentLine]" "[spos]" ""
SearchStr "Ultra Move – Die besondere Materialzusammensetzung sorgt für maximale Bewegungsfreiheit ohne dabei den klassischen Deinmlook zu verfälschen.[#34]" "[spos]" "[next_pos]" ""
EndLoop

Maybe I can't see the flaw in my logic, or perhaps I wrote the code wrong... but I just don't have any more ideas.
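
For comparison, this is roughly what the "parse by TAB, then search each element" idea looks like outside NeoBook, as a Python sketch. It assumes that multi-line fields are wrapped in double quotes (the [#34] in the search string above hints at that, but it is an assumption), and the sample rows are invented.

```python
import csv
import io

# Invented sample: one header row and one record whose last field spans two lines.
sample = (
    'id\tname\tdescription\r\n'
    '1\tShirt\t"Line one\r\nUltra Move – maximale Bewegungsfreiheit."\r\n'
)
needle = "Ultra Move"

hits = []
# newline="" keeps CR LF untouched so the csv module can handle quoted line breaks.
reader = csv.reader(io.StringIO(sample, newline=""), delimiter="\t")
for row_no, row in enumerate(reader, start=1):
    for col_no, field in enumerate(row, start=1):
        if needle in field:
            hits.append((row_no, col_no))

print(hits)
```

The point of the sketch is that a record with a multi-line field is one logical row, so reading the file strictly line by line splits that field apart before the search even starts.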

Question:

Is it possible to open the Replace window in Notepad using SendKeys?

And if so, how?

With best regards:-)

@dglojnar

The problem is that VNwin cannot handle Unicode characters. For example, "ü" and "ä".
Therefore I think that no code with native commands will help. Most likely, only a plugin can solve this problem.

But maybe somebody will offer a working solution. I would be happy to know about it.
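
A small illustration of what happens to "ü" and "ä", in Python rather than NeoBook script, and assuming "ANSI" means Windows-1252: each character is one byte in ANSI but two bytes in UTF-8, so UTF-8 bytes read as ANSI show up as two different characters.

```python
# Byte representation of each character in both encodings (hex).
encodings = {
    ch: (ch.encode("utf-8").hex(), ch.encode("cp1252").hex())
    for ch in ("ü", "ä")
}

# What an ANSI-only reader displays when handed the UTF-8 bytes.
misread = {ch: ch.encode("utf-8").decode("cp1252") for ch in ("ü", "ä")}

print(encodings)
print(misread)
```

A search pattern typed as "ü" in an ANSI application therefore cannot match the two-byte sequence stored in a UTF-8 file.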

 

 

 

@dglojnar

Your 'code for parse approach:', which has a Loop inside a Loop, is missing an EndLoop ... that alone can mess up the results.

So, I checked the file with Notepad++ (as we all know, it can show all characters inside a text file, including CR, LF, etc.).
No matter what the encoding is, ANSI or UTF-8, new lines are always indicated with CR LF ([#13][#10]).

I have not used Notepad++ for a long time, but if it is anything like SciTE, it would seamlessly handle text files of any persuasion ... can you post the ANSI-encoded file here so we can examine its content ... and perhaps do a SearchStr for the same text that you included in your last post.

I fixed it :-)

The problem was the text file encoding; after converting the file to ANSI, everything works like a charm :-)

Thank you very much, Gaev and Vadim, for your help, effort and time, and for pointing me in the right direction :-)

With best regards!