
Thread information:
Forum: WinDev Forum
Posts in thread: 26
First post: 10 months, 1 week ago
Last post: 10 months ago
Participating authors: John Fligg, Fabrice Harari, Art Bonds, DarrenF, DerekT, Al, Mr Black, Stewart Crisler, kingdr, L Jack Wilson, Sascha77

HAdd is so slow

First post by John Fligg on 11.02.2017 15:41

I have exported a database to xml files. I now process those xml files but HADD is slowing everything down.

I have tried HAdd(WxFilename, hNoIndex + hIgnoreIntegrity) and also changed the registry setting suggested previously to prevent Windows backing up the ndx file.

I really do not know why HADD is so slow. Has anyone got any ideas, please? Thx.

Replies:

BTW I also tried HTransaction around HADD etc. and also around the entire process. Still no luck.

by John Fligg - on 11.02.2017 15:43
Hi John,

What is your target database? Are you using ODBC or OLE?

by DarrenF - on 11.02.2017 15:57
Just a normal HF file. No ODBC or anything else.

by John Fligg - on 11.02.2017 16:56
Hoping this will allow replies to be sent to my email address when offline.

by John Fligg - on 11.02.2017 17:08
Maybe HWrite?

http://doc.windev.com/?3044092&name=hwrite_function

Quote

Writes a record into a data file without updating the indexes corresponding to all the keys used in the data file.


...jack

by L Jack Wilson - on 11.02.2017 20:28
Hello Jonathon

Presumably you are running through a number of xml files and adding them into the HF file. Try running the profiler on the process to find the precise time for each part of the operation.

Regards
Al

by Al - on 11.02.2017 21:29
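If the profiler is not at hand, the same idea can be approximated by timing each part of the import by hand. The following is a minimal sketch in Python (not WLanguage), only to illustrate the principle; `PhaseTimer` and the phase names are hypothetical and not part of WinDev.

```python
import time
from collections import defaultdict


class PhaseTimer:
    """Accumulates wall-clock time and call counts per named phase."""

    def __init__(self):
        self.totals = defaultdict(float)  # phase name -> total seconds
        self.counts = defaultdict(int)    # phase name -> number of calls

    def measure(self, phase, func, *args):
        """Run func(*args), charge the elapsed time to `phase`, return the result."""
        start = time.perf_counter()
        result = func(*args)
        self.totals[phase] += time.perf_counter() - start
        self.counts[phase] += 1
        return result

    def report(self):
        """Return (phase, total seconds) pairs, slowest phase first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    timer = PhaseTimer()
    # Stand-ins for the real steps (XML parsing, lookups, adding the record).
    timer.measure("parse", lambda text: text.upper(), "<title>Mrs</title>")
    timer.measure("lookup", time.sleep, 0.01)
    for phase, seconds in timer.report():
        print(f"{phase}: {seconds:.4f} s over {timer.counts[phase]} call(s)")
```

Wrapping each step of the per-record loop this way shows which step dominates the total, which is the same question the profiler answers.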
Hi John,

have you tried HImportXML?

http://doc.windev.com/?3044007&lang=en-US&productversion=xxA210059n

I have never used it, so I can't say anything about performance, but it might be worth giving it a shot.

Cheers,
Sascha

by Sascha77 - on 12.02.2017 05:38
Hi

What is "slow"?
How many records are added per second?
What is the size of the records?
What type of hard drive?

I did some testing when developing wxreplication, and I measured several hundred HAdd calls per second on a quite slow hard drive (plus processing)... So no, I do not think that HAdd is slow.

Best regards

by Fabrice Harari - on 12.02.2017 13:53
Yes, Fabrice... it's due to something else... it's just a matter of finding out what that "something else" is. It's difficult sometimes. I would definitely run the profiler, John.

by DarrenF - on 12.02.2017 14:07
Hi

Standard 1 TB HD, 32 GB RAM, 6 CPUs and 600 GB free HD space. The records are all variable in length.

I agree about the speed. In some server code I have had to slow WB down because it was so fast that it was generating duplicate data!!!

However, I have changed my HReadFirst to HReadSeek with hIdentical, and that has sped things up quite a bit.

But I THINK I have the answer. The fields being imported from XML do not match the fields in the WD file. For example, in the old data I store say Title as a text field containing for example "Mrs". However in the new data file Title is stored as a GUID linking to a Title file. Hence the HReadSeek.

I have noticed that for simple tables it is very fast but for these files such as Client, Horse, where there are multiple "lookups" it is very slow. I am testing with 636 Clients and 1560 Horses. It is taking about an hour to convert the 2196 records.

Sadly the old file is from an old Clarion file format and was designed about 12 years ago so changing the Dictionary or Analysis is not an option.

However I have just devised a way to know how many records are being imported and for what file type. I know the average speed to add a record for each type (Client = 1 second, Horse = 2.3 seconds, simple files = 100 records per second!!!)

So now I can display the estimated time the conversion will finish and let the User decide what to do. I am also working on various checkboxes to limit the data import.

So I think I just have to accept it will take time.

BTW The Clarion app is obviously not Mobile whereas the Wx apps are. So I suggest that Users do NOT perform a data conversion as the Clarion app probably has so much rubbish data in it, it is better to start over with the Mobile app and familiarise themselves with that and start with a "clean" system. IOW this data conversion is not something to be used often.

by John Fligg - on 12.02.2017 14:07
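The estimate described in this post can be checked with a little arithmetic. This is a minimal sketch in Python (not WLanguage) using only the figures quoted above: 636 Clients at about 1 second each and 1560 Horses at about 2.3 seconds each. The function name `estimate_seconds` is hypothetical.

```python
# Average seconds per record, as reported in the post.
SECONDS_PER_RECORD = {"Client": 1.0, "Horse": 2.3}


def estimate_seconds(counts, seconds_per_record):
    """Sum (record count x average seconds per record) over all file types."""
    return sum(n * seconds_per_record[name] for name, n in counts.items())


counts = {"Client": 636, "Horse": 1560}
total = estimate_seconds(counts, SECONDS_PER_RECORD)
print(f"{total:.0f} s, about {total / 60:.0f} min")
```

That is 636 x 1 + 1560 x 2.3 = 4224 seconds, roughly 70 minutes, which agrees with the "about an hour" observed for the 2196 records.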
BTW Fabrice, my "slowing down code" was when creating files for the first time (empty) and also Replication.

I was getting the same as you, several hundred files per second. As I say so fast some data was being duplicated.

by John Fligg - on 12.02.2017 14:11
If you load your 1560 horse records into an array, you could then use ArraySeek() for each client.
It will be a lot faster than HReadSeek() each time.

by DerekT - on 12.02.2017 14:28
An hour for 2000 records?

You definitely have a problem in your code, and it's not the HAdd.

Best regards

by Fabrice Harari - on 12.02.2017 14:44
And a simple trick to speed things up:

As you imported your simple files first, all the records you need to look up are already available.

Do a full query of each of these lookup files first. Then, when you import the complex file, do the HReadSeek on the query dataset instead of the file, thus working in memory.

If the problem was coming from the HReadSeek, the problem will be solved.

Best regards

by Fabrice Harari - on 12.02.2017 14:46
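The idea of reading each lookup file once and then resolving values in memory can be sketched as follows. This is Python rather than WLanguage, and the rows, GUID values, and the helper names `build_lookup` and `convert_client` are invented for illustration; only the field names `Description`, `GUID`, `TITLE` and `TitleGUID` come from the thread.

```python
def build_lookup(rows, key_field, value_field):
    """Read a lookup table once and index it in memory by key_field."""
    return {row[key_field]: row[value_field] for row in rows}


# Hypothetical contents of the Title lookup file, already imported.
title_rows = [
    {"Description": "Mr", "GUID": "GUID-0001"},
    {"Description": "Mrs", "GUID": "GUID-0002"},
    {"Description": "Dr", "GUID": "GUID-0003"},
]
title_guid = build_lookup(title_rows, "Description", "GUID")


def convert_client(raw, title_guid):
    """Replace the old text TITLE with the GUID of the matching Title record."""
    converted = {k: v for k, v in raw.items() if k != "TITLE"}
    title = raw.get("TITLE", "")
    if title and title in title_guid:
        # In-memory hash lookup, no disk access per record.
        converted["TitleGUID"] = title_guid[title]
    return converted


print(convert_client({"NAME": "Smith", "TITLE": "Mrs"}, title_guid))
```

With about 20 such conversions per Horse record, one dictionary per lookup file keeps every conversion in memory while the loop over the complex file stays unchanged.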
I'm almost certain it is the associated HReadSeek commands. I will run it through the Profiler now that I have sped things up a bit.

Reading into an array etc. could be really messy and troublesome. In my Horse file, I have about 20 places where a text field must be converted to a GUID value that links to a lookup table. The Client file has about 15. I also have Inventory, which has about 7. Hence, I am pretty sure, the slowdown.

I just want to get it working and at a manageable speed for now. A complete rewrite is on the cards when I get a few moments.

by John Fligg - on 12.02.2017 14:55
BTW I am using CapeSoft's templates to export the files to XML in the first place. I use the default settings; otherwise I could have made the XML format far different in terms of data.

Unfortunately I could never find out the right embed point to customize it.

by John Fligg - on 12.02.2017 14:57
Hi

Please see the code below, which creates an HF table (.fic) from an XML file
without any analysis file, since HImportXML otherwise needs HOpenAnalysis to proceed.
Use just ONE XML file and ONE .fic file for testing, to find out
whether your computer really is slow or whether something is wrong in your code.

XMLDoc is string="XML"
XMLInfo is string
sXmlTable is string = "C:\1.xml"
sTableName is string
sDir is string=completeDir("c:\wd7\7") //Make your own DIR
sFile is string="dTable"

HCancelDeclaration(dTable)
TableDesc is file description
ItemDesc is item description
dTable is data source
fDelete(sFile+".fic")
fDelete(sFile+".ndx")
hDuplicateKey is int = 2062 //2061 for hUniqueKey
nSize is int = 40

// Description of the "dTable" file
TableDesc..Name = "dTable"
TableDesc..Type = hFileNormal
TableDesc..FicCryptMethod = hCryptStandard


XMLClose(XMLDoc) //Frees the XML document

// Load the XML file in a string
XMLInfo = fLoadText(sXmlTable)
// Initialize the XML functions on this file
XMLDocument(XMLDoc,XMLInfo)
// point to the root
XMLRoot(XMLDoc)
XMLFind(XMLDoc, null)
sTableName = XMLElementName(XMLDoc)
//trace("ParentName"+nElement+"="+ XMLElementName(XMLDoc) )

XMLChild(XMLDoc)
XMLFirst(XMLDoc)

//trace("elementName"+nElement+"="+ XMLElementName(XMLDoc) )
ItemDesc..Name = XMLElementName(XMLDoc)
ItemDesc..Type = hItemText
ItemDesc..Size = nSize
ItemDesc..KeyType = hDuplicateKey

HDescribeItem ( TableDesc , ItemDesc )

nElement is int = 1
XMLNext(XMLDoc)
WHILE NOT XMLOut(XMLDoc)
nElement = nElement + 1
ItemDesc..Name = XMLElementName(XMLDoc)
ItemDesc..Type = hItemText
ItemDesc..Size = nSize
HDescribeItem ( TableDesc , ItemDesc )
// trace("elementName"+nElement+"="+ XMLElementName(XMLDoc) )
XMLNext(XMLDoc)
END

HDeclareExternal(sDir+sFile+".fic", sFile)
HDescribeFile ( TableDesc )
HImportXML ( dTable , sXmlTable , hImpCreation )
//Make a DUMMY Memory table, a few columns will do, mine is a 3-column table
OpenChild("t=c:\wd7\wdw\t03.wdw")

BuildBrowsingTable("t.tbl", sFile,taFillTable) //t.tbl WDWname.TableName
HCancelDeclaration(sFile)
HClose(dTable)

//trace("Total # of Elements in "+ sTableName + " = " + nElement)
// Cancels the search for the other XML functions used thereafter

XMLCancelSearch(XMLDoc)
XMLClose(XMLDoc) //Frees the XML document

HTH

King

by kingdr - on 12.02.2017 15:24
Many thanks for that. However, sadly it still does not resolve the problem.

In my Clarion file I have Field.Title = "Mrs." but in my WD file the Title is a GUID (WDField.Title = "ABC1928938").

So having imported all the Titles and other lookup tables (which takes about 1 nanosecond), I still have to use HReadSeek to convert Title to a known GUID.

The whole problem is the HReadSeek. Nothing else.

by John Fligg - on 12.02.2017 15:32
John, how are you using HReadSeek to convert the Mrs to ABC1928938? Maybe posting a code snippet would help.

by Art Bonds - on 12.02.2017 17:37
IF XMLExtractString(XMLSourceNew, "TITLE", i) <> "" THEN
	IF HReadSeek(Title, Description, XMLExtractString(XMLSourceNew, "TITLE", i), hIdentical) = True THEN
		{WxFilename + ".TitleGUID"} = Title.GUID
	END
END
END

by John Fligg - on 12.02.2017 18:48
Hello John

You are performing the XMLExtractString(XMLSourceNew, "TITLE", i) twice:
once as a test and then again in the HReadSeek().
It would be quicker to store the value in a string, test the string <> "" and then use the string value in the HReadSeek().

You also don't need the "= True" at the end of the HReadSeek(), but I don't know if that has any effect on the speed.

Also, HReadSeekFirst() may process differently from HReadSeek() with hIdentical.

Regards
Al

by Al - on 12.02.2017 19:02
I agree with Al regarding both the XML and HReadSeek vs HReadSeekFirst. In the Help for HReadSeek I see this:

"By default, HReadSeekFirst and HReadSeekLast are used to perform exact-match searches."

"If you are using new data files in HFSQL Classic format:
In the data model editor, the "Classic Mode - Text items ended by a binary zero character"\0"" option is automatically checked.
To perform an exact-match search, you must use:
...the function HReadSeekFirst.
...HReadSeek and add Charact(0) at the end of the sought value."

I would rewrite the XML stuff per Al, and use HReadSeekFirst.

by Art Bonds - on 12.02.2017 19:52
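The difference between an exact-match search and a "starts with" search on a sorted key can be illustrated outside of HFSQL. The sketch below is Python, not WLanguage, and is only a model of the two search styles on a sorted list of keys; it makes no claim about how HReadSeek or HReadSeekFirst are implemented. The names `seek_generic` and `seek_exact` and the sample keys are hypothetical.

```python
from bisect import bisect_left


def seek_generic(sorted_keys, sought):
    """Return the first key that starts with `sought`, or None (prefix search)."""
    i = bisect_left(sorted_keys, sought)
    if i < len(sorted_keys) and sorted_keys[i].startswith(sought):
        return sorted_keys[i]
    return None


def seek_exact(sorted_keys, sought):
    """Return `sought` only if it is present as a whole key, else None."""
    i = bisect_left(sorted_keys, sought)
    if i < len(sorted_keys) and sorted_keys[i] == sought:
        return sorted_keys[i]
    return None


keys = sorted(["Dr", "Miss", "Mr", "Mrs", "Ms"])
print(seek_generic(keys, "Mi"))  # a prefix is enough to get a hit
print(seek_exact(keys, "Mi"))    # no whole key "Mi" exists
```

In this model a prefix search for "Mi" lands on "Miss", while the exact search finds nothing. For a text-to-GUID conversion, a prefix hit of that kind would silently link the wrong lookup record, which is why the exact-match variant matters here.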
John
At the risk of repeating myself, I still think that arrays will speed things up.

FOR i = 1 TO ???????
	lsTitle = XMLExtractString(XMLSourceNew, "TITLE", i)

	IF lsTitle <> "" THEN
		lnSeekRes = ArraySeek(ArrTitle, asLinear, "Description", lsTitle)
		IF lnSeekRes > 0 THEN
			{WxFilename + ".TitleGUID"} = ArrTitle[lnSeekRes].GUID
		END
	END
END



With the TITLE data held in an array, HReadSeek() becomes redundant.
All seeks are now carried out in memory, so they will be blindingly quick.

Unless of course I have missed something in here along the way.

by DerekT - on 12.02.2017 19:52
You may be able to get some performance improvement by using HCreateView and HExecuteView. Assuming the file you are doing HReadSeek against is not too large for available memory, these commands load a copy of the file into memory; H commands then work against the copy in memory instead of being dependent on disk I/O.

by Mr Black - on 12.02.2017 21:33
I learned from VFP that if something is not fast enough, the first place to look is the file indexes. Perhaps one or more lookups are not indexed correctly, or an index needs rebuilding.

Stewart Crisler

by Stewart Crisler - on 13.02.2017 20:35