PHP Sitemap - Scan problem

Started by txmodxoops, June 02, 2016, 12:07:51 PM


txmodxoops

Hi

I have this problem!

When a scanned URL ends with /index.php instead of /, the script puts a second URL on the same line, wrapped in single quotes: a different URL that includes index.php.

Example:
http://www.mysite.org/modules/mymodule/'http://www.mysite.org/index.php'
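One way to avoid this kind of concatenated entry is to clean each extracted link before joining it to the base URL. The following is only a minimal sketch: `normalize_link` is a hypothetical helper, not a function from the sitemap script, and it assumes the base URL is a directory that has a scheme and a host.

```php
<?php
// Hypothetical helper, not taken from the sitemap script.
function normalize_link($base, $href)
{
    // Strip whitespace and stray quotes left around the attribute value.
    $href = trim($href, " \t\n\r'\"");

    // An absolute URL (it has a scheme) must not be appended to the base.
    $scheme = parse_url($href, PHP_URL_SCHEME);
    if (is_string($scheme) && $scheme !== '') {
        return $href;
    }

    // Root-relative link: keep only scheme and host of the base.
    if (substr($href, 0, 1) === '/') {
        $parts = parse_url($base);
        return $parts['scheme'] . '://' . $parts['host'] . $href;
    }

    // Relative link: append it to the base directory.
    return rtrim($base, '/') . '/' . $href;
}

echo normalize_link(
    'http://www.mysite.org/modules/mymodule/',
    "'http://www.mysite.org/index.php'"
), "\n";
```

With the example from this post, the quoted absolute URL comes back on its own instead of being glued onto the module path.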

Thanks for your wonderful work :)

Elmar

Hello,

did you try the new version?

Regards Elmar

txmodxoops

Hello,

Yes, the result is the same as with the previous versions.

Elmar

Please send me the URL of your website, by private message if you prefer. I will check it in the next few days.


txmodxoops

I want to point out that the left argument of define() should be in quotes (Windows OS).

See the PHP manual: http://php.net/manual/en/function.define.php
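For illustration, a minimal sketch of the quoted form. The constant name `SITE_URL` and its value are hypothetical and are not taken from the sitemap script.

```php
<?php
// The constant name is passed as a quoted string (hypothetical name and value).
define('SITE_URL', 'http://www.mysite.org/');

echo SITE_URL, "\n";

// Without the quotes, as in define(SITE_URL, ...), PHP first looks for an
// existing constant with that name. Older versions such as PHP 5.4 report
// "Use of undefined constant" and fall back to the string; PHP 8 throws an Error.
```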

txmodxoops

In my error log there are many warnings.

It is advisable to move the defines that are at the bottom to the top.

Elmar

I tried the latest Sitemap version 2.0-test1 with your page and it worked without problems.

txmodxoops

I receive these notices:

Notice: Undefined index: scheme in /sitemap_2.0-test1.php on line 163
Notice: Undefined index: host in /sitemap_2.0-test1.php on line 168
(this pair of notices is repeated 17 times)
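Notices like these typically appear when the result of `parse_url()` is read for a relative link, which has no `scheme` or `host` key. The code at lines 163 and 168 of the script is not shown in this thread, so the following is only a sketch under that assumption; `url_parts` is a hypothetical helper. It avoids the `??` operator so that it also runs on PHP 5.4.

```php
<?php
// Hypothetical helper; assumes the notices come from reading parse_url() output.
function url_parts($url)
{
    $parts = parse_url($url);
    if ($parts === false) {
        // Seriously malformed URL: treat it as having no components.
        $parts = array();
    }

    // isset() guards each key, so a missing component raises no notice.
    return array(
        'scheme' => isset($parts['scheme']) ? $parts['scheme'] : '',
        'host'   => isset($parts['host']) ? $parts['host'] : '',
    );
}

// A relative link has neither scheme nor host.
var_dump(url_parts('modules/mymodule/index.php'));
// An absolute link has both.
var_dump(url_parts('http://www.mysite.org/index.php'));
```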

txmodxoops

Quote from: Elmar on June 06, 2016, 10:03:52 AM
I tried the latest Sitemap version 2.0-test1 with your page and it worked without problems.

Compared to the previous version, it does not collect all the links properly; there are many more.

Where can I send the generated files to show them to you?

Thanks...!

Elmar

Send me a private message with the file attached.

The notice messages can be ignored. They will be removed in 2.0-test2.

txmodxoops

PHP version 5.4

The attached file was created with version 2.0. It works fine, but it does not create all the links.

Why is that?

txmodxoops

This file was created with version 1.0, and you can see the problem!

Don't you have an account on GitHub?

Elmar

Forget the 1.0 version. Only use 2.0 (test).

Give me some examples of which link on which page is missing.

txmodxoops

I sent a file to show you my result with version 1.0. Version 2.0 works, but not for all links!

I would like to integrate your code into a XOOPS module, turning it into a XOOPS class.

Will you allow me?

Thank you

Elmar

A bug-fixed version is available for download.

Best regards
Elmar

txmodxoops

Results of the new version 2.0 test 2

There is a little bug: &amp;

Thanks again!

Elmar

Hello,

No, that's not a bug. It is correctly encoded for XML.

The '&' is a reserved character in XML and has to be encoded as '&amp;'.
The link on your page is, for example, ...../mylinks/singlelink.php?cid=1&lid=12
The '&' has to be encoded, and the result for XML is ...../mylinks/singlelink.php?cid=1&amp;lid=12
When the sitemap crawler reads the XML file, it decodes the '&amp;' back to '&', and the URL is the same as on your page.
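The round trip described above can be sketched as follows. The URL is a made-up example, and `htmlspecialchars()` with `ENT_XML1` (available from PHP 5.4) is just one way to do the encoding; this is not necessarily how the sitemap script does it.

```php
<?php
// Made-up example URL with two query parameters.
$url = 'http://www.example.org/page.php?cid=1&lid=12';

// Encode the reserved characters for use inside an XML document.
$xml = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
echo $xml, "\n";   // http://www.example.org/page.php?cid=1&amp;lid=12

// A reader of the XML file decodes it back to the original URL.
$back = htmlspecialchars_decode($xml, ENT_QUOTES | ENT_XML1);
echo $back, "\n";  // http://www.example.org/page.php?cid=1&lid=12
```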


By the way, I think the ...../mylinks/singlelink.php?cid=1&amp;lid=12 URLs are generated by a module of your site. The browser will translate them to ...../mylinks/singlelink.php?cid=1&lid=12, so it's not a problem.


View the source code of your site's index page in your browser and search for   singlelink.php?cid=1&amp;lid=12   and you will find it.


Regards
Elmar

txmodxoops

Yes, I know what you mean,

but my request was for code that eliminates any duplicated &amp; or amp;

I have this:
$link = 'page.php?get=' . urlencode($valore);
echo '<a href="' . htmlentities($link) . '">Here</a>';

Do you think it is appropriate to write this in the XOOPS module code, or could you put it in your code?

Regards

Elmar

My script takes the links as they appear in your website code. You have to clean up your website code if you don't want the duplicates.
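To illustrate where a duplicated entity can come from: if an href is copied from the HTML source with its entity intact and is then encoded again for XML, the '&' is encoded twice. The sketch below shows that effect and one possible way around it, which is to decode the href before encoding it. This is an assumption for illustration only, not a description of what the sitemap script does.

```php
<?php
// An href exactly as it appears in the HTML source, entity included.
$href = 'singlelink.php?cid=1&amp;lid=12';

// Encoding the raw href again duplicates the entity.
$double = htmlspecialchars($href, ENT_QUOTES | ENT_XML1, 'UTF-8');
echo $double, "\n"; // singlelink.php?cid=1&amp;amp;lid=12

// Decoding first recovers the real URL, which is then encoded only once.
$url  = html_entity_decode($href, ENT_QUOTES, 'UTF-8');
$once = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
echo $once, "\n";   // singlelink.php?cid=1&amp;lid=12
```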

txmodxoops

I have not yet figured out why only the latest article URLs are included, and not all the articles from first to last...

Is it a problem of access to all the links?

Elmar

Quote from: txmodxoops on June 26, 2016, 15:08:38 PM
Do you think it is appropriate to write this in the XOOPS module code, or could you put it in your code?

The clean way is to fix it in your code.


You will end up with &amp; in your HTML code as long as you use htmlentities for a URL that contains an &. I don't know why you use htmlentities for the anchor.

Elmar

You should use htmlentities for text output and so on, but not for an anchor.

txmodxoops

I agree; that is why I have not solved the problem.