Compare commits
No commits in common. "master" and "master" have entirely different histories.
|
@ -39,9 +39,9 @@ tl;dr the best way to actually browse for shit.
|
|||
| Qwant | Qwant | Startpage | Mojeek | | Kagi |
|
||||
| Ghostery | Yep | Qwant | | | Qwant |
|
||||
| Yep | Solofield | Solofield | | | Ghostery |
|
||||
| Greppr | Pinterest | | | | Yep |
|
||||
| Crowdview | Imgur | | | | Marginalia |
|
||||
| Mwmbl | FindThatMeme | | | | YouTube |
|
||||
| Greppr | Imgur | | | | Yep |
|
||||
| Crowdview | FindThatMeme | | | | Marginalia |
|
||||
| Mwmbl | | | | | YouTube |
|
||||
| Mojeek | | | | | Soundcloud |
|
||||
| Solofield | | | | | |
|
||||
| Marginalia | | | | | |
|
||||
|
|
21
api.txt
|
@ -1,17 +1,10 @@
|
|||
44
|
||||
4444444 44
|
||||
44444444 44444 444
|
||||
44444444 444444 444444444
|
||||
44444 44444444 444444444
|
||||
444444444 4444444
|
||||
4444444444 444444
|
||||
4444444444444
|
||||
444444444444444444
|
||||
444444444444444
|
||||
44444444
|
||||
4444
|
||||
44
|
||||
|
||||
__ __ __
|
||||
/ // / ____ ____ / /_
|
||||
/ // /_/ __ `/ _ \/ __/
|
||||
/__ __/ /_/ / __/ /_
|
||||
/_/ \__, /\___/\__/
|
||||
/____/
|
||||
|
||||
+ Welcome to the 4get API documentation +
|
||||
|
||||
+ Terms of use
|
||||
|
|
|
@ -119,7 +119,7 @@ class config{
|
|||
|
||||
// Default user agent to use for scraper requests. Sometimes ignored to get specific webpages
|
||||
// Changing this might break things.
|
||||
const USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:133.0) Gecko/20100101 Firefox/133.0";
|
||||
const USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:129.0) Gecko/20100101 Firefox/129.0";
|
||||
|
||||
// Proxy pool assignments for each scraper
|
||||
// false = Use server's raw IP
|
||||
|
@ -129,7 +129,6 @@ class config{
|
|||
const PROXY_BRAVE = false;
|
||||
const PROXY_FB = false; // facebook
|
||||
const PROXY_GOOGLE = false;
|
||||
const PROXY_GOOGLE_CSE = false;
|
||||
const PROXY_STARTPAGE = false;
|
||||
const PROXY_QWANT = false;
|
||||
const PROXY_GHOSTERY = false;
|
||||
|
@ -158,9 +157,6 @@ class config{
|
|||
// Scraper-specific parameters
|
||||
//
|
||||
|
||||
// GOOGLE CSE
|
||||
const GOOGLE_CX_ENDPOINT = "d4e68b99b876541f0";
|
||||
|
||||
// MARGINALIA
|
||||
// Use "null" to default out to HTML scraping OR specify a string to
|
||||
// use the API (Eg: "public"). API has less filters.
|
||||
|
|
271
docs/nginx.md
|
@ -1,194 +1,103 @@
|
|||
<h1 align=center>Installation of 4get in NGINX</h1>
|
||||
# Install on NGINX
|
||||
|
||||
<div align=right>
|
||||
>I do NOT recommend following this guide, only follow this if you *really* need to use nginx. I recommend you use the apache2 steps instead.
|
||||
|
||||
> NOTE: As the previous version stated, it is better to follow the <a href="https://git.lolcat.ca/lolcat/4get/src/branch/master/docs/apache2.md">Apache2 guide</a> instead of the Nginx one.
|
||||
Login as root.
|
||||
|
||||
> NOTE: This guide assumes you're using either an <abbr title="(Arch Linux, Artix Linux, EndeavourOS, etc...) ">Arch-based system</abbr> or a <abbr title="(Debian, Ubuntu, Devuan, etc...)">Debian-based system</abbr>, although you can still follow it on other systems with minor issues.
|
||||
Create a file in `/etc/nginx/sites-available/` called `4get.conf` or any name you want and put this into the file:
|
||||
|
||||
</div>
|
||||
|
||||
1. Login as root.
|
||||
2. Upgrade your system:
|
||||
* On Arch-based, run `pacman -Syu`.
|
||||
* On Debian-based, run `apt update`, then `apt upgrade`.
|
||||
3. Install the following dependencies:
|
||||
* `git`: So you can clone <a href="https://git.lolcat.ca/lolcat/4get">this</a> repository.
|
||||
* `nginx`: So you can run Nginx.
|
||||
* `php-fpm`: This is what allows Nginx to run *(and show)* PHP files.
|
||||
* `php-imagick`, `imagemagick`: Image manipulation.
|
||||
* `php-apcu`: Caching module.
|
||||
* `php-curl`, `curl`: Transferring data with URLs.
|
||||
* `php-mbstring`: String utils.
|
||||
* `certbot`, `certbot-nginx`: ACME client. Used to create SSL certificates.
|
||||
* In Arch-based distributions:
|
||||
* `pacman -S nginx certbot php-imagick certbot-nginx imagemagick curl php-apcu git`
|
||||
* In Debian-based distributions:
|
||||
* `apt install php-mbstring nginx certbot-nginx certbot php-imagick imagemagick php-curl curl php-apcu git`
|
||||
|
||||
<div align=right>
|
||||
|
||||
> IMPORTANT: `php-curl` and `php-mbstring` might be Debian-only packages, but this needs further fact checking.
|
||||
|
||||
> IMPORTANT: If having issues with `php-apcu` or `libsodium`, go to [^1].
|
||||
|
||||
</div>
|
||||
|
||||
4. `cd` to `/etc/nginx` and make the `conf.d/` directory if it doesn't exist:
|
||||
* Again, this assumes you're logged in as root.
|
||||
```sh
|
||||
cd /etc/nginx
|
||||
ls -l conf.d/ # If ls shows conf.d, then it means it exists.
|
||||
# If it does not, run:
|
||||
mkdir conf.d
|
||||
```
|
||||
5. Make a file inside `conf.d/` called `4get.conf` and place the following content:
|
||||
* First run `touch conf.d/4get.conf`, then `nano conf.d/4get.conf` to open the nano editor: *(Install it if it is not installed, or use another editor.)*
|
||||
```sh
|
||||
server {
|
||||
access_log /dev/null; # Search log file. Do you really need to?
|
||||
error_log /dev/null; # Error log file.
|
||||
|
||||
# Change this if you have 4get in another folder.
|
||||
root /var/www/4get;
|
||||
# Change 'yourdomain' to your domain.
|
||||
server_name www.yourdomain.com yourdomain.com;
|
||||
# Port to listen to.
|
||||
listen 80;
|
||||
|
||||
location @php {
|
||||
try_files $uri.php $uri/index.php =404;
|
||||
# Change the unix socket address if it's different for you.
|
||||
fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
|
||||
fastcgi_index index.php;
|
||||
# Change this to `fastcgi_params` if you use a debian based distribution.
|
||||
include fastcgi.conf;
|
||||
fastcgi_intercept_errors on;
|
||||
}
|
||||
|
||||
location / {
|
||||
try_files $uri @php;
|
||||
}
|
||||
|
||||
location ~* ^(.*)\.php$ {
|
||||
return 301 $1;
|
||||
}
|
||||
```
|
||||
server {
|
||||
# DO YOU REALLY NEED TO LOG SEARCHES?
|
||||
access_log /dev/null;
|
||||
error_log /dev/null;
|
||||
# Change this if you have 4get in other folder.
|
||||
root /var/www/4get;
|
||||
# Change yourdomain by your domain lol
|
||||
server_name www.yourdomain.com yourdomain.com;
|
||||
|
||||
location @php {
|
||||
try_files $uri.php $uri/index.php =404;
|
||||
# Change the unix socket address if it's different for you.
|
||||
fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
|
||||
fastcgi_index index.php;
|
||||
# Change this to `fastcgi_params` if you use a debian based distro.
|
||||
include fastcgi.conf;
|
||||
fastcgi_intercept_errors on;
|
||||
}
|
||||
```
|
||||
* The above is a very basic configuration and thus will need tweaking to your personal needs. It should still work as-is, though. A 'real world' example is present in [^2].
|
||||
* After saving the file, check that the `nginx.conf` file inside the main directory includes files inside `conf.d/`:
|
||||
* It should be inside the http block: *(The following is an example! Don't just copy and paste it!)*
|
||||
```sh
|
||||
http {
|
||||
include mime.types;
|
||||
include conf.d/*.conf;
|
||||
types_hash_max_size 4096;
|
||||
# ...
|
||||
}
|
||||
```
|
||||
* Now, test your configuration with `nginx -t`. If it reports that everything is good, restart *(or start)* the Nginx daemon:
|
||||
* This depends on the init system. Most distributions use `systemd`, but commands for the other common init systems are listed as well.
|
||||
```sh
|
||||
# systemd
|
||||
systemctl stop nginx
|
||||
systemctl start nginx
|
||||
# or
|
||||
systemctl restart nginx
|
||||
|
||||
# openrc
|
||||
rc-service nginx stop
|
||||
rc-service nginx start
|
||||
# or
|
||||
rc-service nginx restart
|
||||
|
||||
# runit
|
||||
sv down nginx
|
||||
sv up nginx
|
||||
# or
|
||||
sv restart nginx
|
||||
|
||||
# s6
|
||||
s6-rc -d change nginx
|
||||
s6-rc -u change nginx
|
||||
# or
|
||||
s6-svc -r /run/service/nginx
|
||||
|
||||
# dinit
|
||||
dinitctl stop nginx
|
||||
dinitctl start nginx
|
||||
# or
|
||||
dinitctl restart nginx
|
||||
```
|
||||
6. Clone the repository to `/var/www`:
|
||||
* `git clone --depth 1 https://git.lolcat.ca/lolcat/4get 4get` - It clones the repository with the depth of one commit *(so it takes less time to download)* and saves the cloned repository as '4get'.
|
||||
7. That should be it! There are some extra steps you can take, but it really just depends on you.
|
||||
|
||||
<h2 align=center>Encryption setup</h2>
|
||||
|
||||
1. Generate a certificate for the domain you're using with:
|
||||
* Note that `certbot-nginx` is needed.
|
||||
```sh
|
||||
certbot --nginx --key-type ecdsa -d www.yourdomain.com -d yourdomain.com
|
||||
```
|
||||
2. After that, certbot will deploy the certificate automatically to your 4get conf file; it should be ready to use from there.
|
||||
|
||||
<h2 align=center>Tor Setup</h2>
|
||||
|
||||
<div align=right>
|
||||
|
||||
> IMPORTANT: Tor onion addresses are very long compared to traditional domains, so before doing anything, edit `nginx.conf` and increase <abbr title="This setting in your Nginx configuration controls the internal data structure used to manage multiple server names (hostnames) associated with your web server. Each hostname requires a certain amount of memory within this structure. If the size is insufficient, Nginx will encounter errors."><code>server_names_hash_bucket_size</code></abbr> to your needs.
|
||||
|
||||
</div>
|
||||
|
||||
1. `cd` to `/etc/nginx` *(if you haven't)* and open your `nginx.conf` file.
|
||||
2. Find the line containing `# server_names_hash_bucket_size 64;` inside said file.
|
||||
3. Uncomment the line and adjust the value; start with 64, but if you encounter issues, incrementally increase it *(e.g., 128, 256)* until it accommodates your configuration.
|
||||
4. Open *(or duplicate)* the configuration and edit it:
|
||||
* Example configuration, again:
|
||||
```sh
|
||||
server {
|
||||
access_log /dev/null; # Search log file. Do you really need to?
|
||||
error_log /dev/null; # Error log file.
|
||||
|
||||
# Change this if you have 4get in another folder.
|
||||
root /var/www/4get;
|
||||
# Change 'onionaddress.onion' to your onion link.
|
||||
server_name onionaddress.onion;
|
||||
# Port to listen to.
|
||||
listen 80;
|
||||
|
||||
location @php {
|
||||
try_files $uri.php $uri/index.php =404;
|
||||
# Change the unix socket address if it's different for you.
|
||||
fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
|
||||
fastcgi_index index.php;
|
||||
# Change this to `fastcgi_params` if you use a debian based distribution.
|
||||
include fastcgi.conf;
|
||||
fastcgi_intercept_errors on;
|
||||
}
|
||||
|
||||
location / {
|
||||
try_files $uri @php;
|
||||
}
|
||||
|
||||
location ~* ^(.*)\.php$ {
|
||||
return 301 $1;
|
||||
}
|
||||
|
||||
location / {
|
||||
try_files $uri @php;
|
||||
}
|
||||
```
|
||||
A real world example is present in [^2].
|
||||
5. Once done, check the configuration with `nginx -t`. If everything's fine and dandy, refer to <a href="https://git.lolcat.ca/lolcat/4get/src/branch/master/docs/tor.md">the Tor guide</a> to set up your onion site.
|
||||
|
||||
<h2 align=center>Other important things</h2>
|
||||
location ~* ^(.*)\.php$ {
|
||||
return 301 $1;
|
||||
}
|
||||
|
||||
1. <a href="https://git.lolcat.ca/lolcat/4get/src/branch/master/docs/configure.md">Configuration guide</a>: Things to do after setup.
|
||||
2. <a href="https://git.lolcat.ca/lolcat/4get/src/branch/master/docs/apache2.md">Apache2 guide</a>: Fallback to this if you couldn't get something to work, or you don't know something.
|
||||
listen 80;
|
||||
}
|
||||
```
|
||||
|
||||
<h2 align=center>Known issues</h2>
|
||||
That is a very basic config so you will need to adapt it to your needs in case you have a more complicated nginx configuration. Anyways, you can see a real world example [here](https://git.zzls.xyz/Fijxu/etc-configs/src/branch/selfhost/nginx/sites-available/4get.zzls.xyz.conf)
|
||||
|
||||
1. https://git.lolcat.ca/lolcat/4get/issues
|
||||
After you save the file, you will need to create a symlink of the `4get.conf` file in `/etc/nginx/sites-enabled/`. You can do it with this command:
|
||||
|
||||
[^1]: lolcat/4get#40, If having issues with `libsodium`, or `php-apcu`.
|
||||
[^2]: <a href="https://git.nadeko.net/Fijxu/etc-configs/src/branch/selfhost/nginx/conf.d/4get.conf">git.nadeko.net</a> nadeko.net's 4get instance configuration.
|
||||
```sh
|
||||
ln -s /etc/nginx/sites-available/4get.conf /etc/nginx/sites-enabled/4get.conf
|
||||
```
|
||||
|
||||
Now test the nginx config with `nginx -t`. If it says that everything is good, restart nginx using `systemctl restart nginx`.
|
||||
|
||||
# Encryption setup
|
||||
|
||||
Generate a certificate for the domain using:
|
||||
|
||||
```sh
|
||||
certbot --nginx --key-type ecdsa -d www.yourdomain.com -d yourdomain.com
|
||||
```
|
||||
(Remember to install the nginx certbot plugin!!!)
|
||||
|
||||
After doing that certbot should deploy the certificate automatically into your 4get nginx config file. It should be ready to use at that point.
|
||||
|
||||
# Tor setup on NGINX
|
||||
|
||||
Important Note: Tor onion addresses are significantly longer than traditional domain names. Before proceeding with Nginx configuration, ensure you increase the `server_names_hash_bucket_size` value in your `nginx.conf` file. This setting in your Nginx configuration controls the internal data structure used to manage multiple server names (hostnames) associated with your web server. Each hostname requires a certain amount of memory within this structure. If the size is insufficient, Nginx will encounter errors.
|
||||
|
||||
1. Open your `nginx.conf` file (that is under `/etc/nginx/nginx.conf`).
|
||||
2. Find the line containing `# server_names_hash_bucket_size 64;`.
|
||||
3. Uncomment the line and adjust the value. Start with 64, but if you encounter issues, incrementally increase it (e.g., 128, 256) until it accommodates your configuration.
|
||||
|
||||
Open your current 4get NGINX config (that is under `/etc/nginx/sites-available/`) and append this to the end of the file:
|
||||
|
||||
```
|
||||
server {
|
||||
access_log /dev/null;
|
||||
error_log /dev/null;
|
||||
|
||||
listen 80;
|
||||
server_name <youronionaddress>;
|
||||
root /var/www/4get;
|
||||
|
||||
location @php {
|
||||
try_files $uri.php $uri/index.php =404;
|
||||
# Change the unix socket address if it's different for you.
|
||||
fastcgi_pass unix:/var/run/php-fpm/php-fpm.sock;
|
||||
fastcgi_index index.php;
|
||||
# Change this to `fastcgi_params` if you use a debian based distro.
|
||||
include fastcgi.conf;
|
||||
fastcgi_intercept_errors on;
|
||||
}
|
||||
|
||||
location / {
|
||||
try_files $uri @php;
|
||||
}
|
||||
|
||||
location ~* ^(.*)\.php$ {
|
||||
return 301 $1;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Replace `<youronionaddress>` with the onion address found in `/var/lib/tor/4get/hostname`, then check that the nginx config is valid with `nginx -t`. If it is, restart the nginx service and try opening the onion address in the Tor Browser. You can see a real world example [here](https://git.zzls.xyz/Fijxu/etc-configs/src/branch/selfhost/nginx/sites-available/4get.zzls.xyz.conf)
|
||||
|
||||
Once you have done the above, refer to <a href="https://git.lolcat.ca/lolcat/4get/src/branch/master/docs/tor.md">this tor guide</a> to set up your onion site.
|
||||
|
|
|
@ -75,7 +75,6 @@ class backend{
|
|||
break;
|
||||
|
||||
case "socks5_hostname":
|
||||
case "socks5h":
|
||||
case "socks5a":
|
||||
curl_setopt($curlproc, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME);
|
||||
curl_setopt($curlproc, CURLOPT_PROXY, $address . ":" . $port);
|
||||
|
|
|
@ -838,10 +838,10 @@ class frontend{
|
|||
}
|
||||
|
||||
$payload .=
|
||||
'<a href="https://webcache.googleusercontent.com/search?q=cache:' . $urlencode . '" class="list" target="_BLANK"><img src="/favicon?s=https://google.com" alt="go">Google cache</a>' .
|
||||
'<a href="https://web.archive.org/web/' . $urlencode . '" class="list" target="_BLANK"><img src="/favicon?s=https://archive.org" alt="ar">Archive.org</a>' .
|
||||
'<a href="https://archive.ph/newest/' . htmlspecialchars($link) . '" class="list" target="_BLANK"><img src="/favicon?s=https://archive.is" alt="ar">Archive.is</a>' .
|
||||
'<a href="https://ghostarchive.org/search?term=' . $urlencode . '" class="list" target="_BLANK"><img src="/favicon?s=https://ghostarchive.org" alt="gh">Ghostarchive</a>' .
|
||||
'<a href="https://arquivo.pt/wayback/' . htmlspecialchars($link) . '" class="list" target="_BLANK"><img src="/favicon?s=https://arquivo.pt" alt="ar">Arquivo.pt</a>' .
|
||||
'<a href="https://www.bing.com/search?q=url%3A' . $urlencode . '" class="list" target="_BLANK"><img src="/favicon?s=https://bing.com" alt="bi">Bing cache</a>' .
|
||||
'<a href="https://megalodon.jp/?url=' . $urlencode . '" class="list" target="_BLANK"><img src="/favicon?s=https://megalodon.jp" alt="me">Megalodon</a>' .
|
||||
'</div>';
|
||||
|
@ -939,7 +939,6 @@ class frontend{
|
|||
"brave" => "Brave",
|
||||
"yandex" => "Yandex",
|
||||
"google" => "Google",
|
||||
"google_cse" => "Google CSE",
|
||||
"startpage" => "Startpage",
|
||||
"qwant" => "Qwant",
|
||||
"ghostery" => "Ghostery",
|
||||
|
@ -964,12 +963,11 @@ class frontend{
|
|||
"yandex" => "Yandex",
|
||||
"brave" => "Brave",
|
||||
"google" => "Google",
|
||||
"google_cse" => "Google CSE",
|
||||
"startpage" => "Startpage",
|
||||
"qwant" => "Qwant",
|
||||
"yep" => "Yep",
|
||||
"solofield" => "Solofield",
|
||||
"pinterest" => "Pinterest",
|
||||
//"pinterest" => "Pinterest",
|
||||
"imgur" => "Imgur",
|
||||
"ftm" => "FindThatMeme"
|
||||
]
|
||||
|
|
109
lib/fuckhtml.php
|
@ -381,8 +381,6 @@ class fuckhtml{
|
|||
$json_out = null;
|
||||
$last_char = null;
|
||||
|
||||
$keyword_check = null;
|
||||
|
||||
for($i=0; $i<strlen($json); $i++){
|
||||
|
||||
switch($json[$i]){
|
||||
|
@ -398,7 +396,6 @@ class fuckhtml{
|
|||
|
||||
$bracket = false;
|
||||
$is_close_bracket = true;
|
||||
|
||||
}else{
|
||||
|
||||
if($bracket === false){
|
||||
|
@ -432,31 +429,6 @@ class fuckhtml{
|
|||
$is_close_bracket === false
|
||||
){
|
||||
|
||||
// do keyword check
|
||||
$keyword_check .= $json[$i];
|
||||
|
||||
if(in_array($json[$i], [":", "{"])){
|
||||
|
||||
$keyword_check = substr($keyword_check, 0, -1);
|
||||
|
||||
if(
|
||||
preg_match(
|
||||
'/function|array|return/i',
|
||||
$keyword_check
|
||||
)
|
||||
){
|
||||
|
||||
$json_out =
|
||||
preg_replace(
|
||||
'/[{"]*' . preg_quote($keyword_check, "/") . '$/',
|
||||
"",
|
||||
$json_out
|
||||
);
|
||||
}
|
||||
|
||||
$keyword_check = null;
|
||||
}
|
||||
|
||||
// here we know we're not iterating over a quoted string
|
||||
switch($json[$i]){
|
||||
|
||||
|
@ -526,85 +498,4 @@ class fuckhtml{
|
|||
$string
|
||||
);
|
||||
}
|
||||
|
||||
public function extract_json($json){
|
||||
|
||||
$len = strlen($json);
|
||||
$array_level = 0;
|
||||
$object_level = 0;
|
||||
$in_quote = null;
|
||||
$start = null;
|
||||
|
||||
for($i=0; $i<$len; $i++){
|
||||
|
||||
switch($json[$i]){
|
||||
|
||||
case "[":
|
||||
if($in_quote === null){
|
||||
|
||||
$array_level++;
|
||||
if($start === null){
|
||||
|
||||
$start = $i;
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case "]":
|
||||
if($in_quote === null){
|
||||
|
||||
$array_level--;
|
||||
}
|
||||
break;
|
||||
|
||||
case "{":
|
||||
if($in_quote === null){
|
||||
|
||||
$object_level++;
|
||||
if($start === null){
|
||||
|
||||
$start = $i;
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case "}":
|
||||
if($in_quote === null){
|
||||
|
||||
$object_level--;
|
||||
}
|
||||
break;
|
||||
|
||||
case "\"":
|
||||
case "'":
|
||||
if(
|
||||
$i !== 0 &&
|
||||
$json[$i - 1] !== "\\"
|
||||
){
|
||||
// found a non-escaped quote
|
||||
|
||||
if($in_quote === null){
|
||||
|
||||
// open quote
|
||||
$in_quote = $json[$i];
|
||||
}elseif($in_quote === $json[$i]){
|
||||
|
||||
// close quote
|
||||
$in_quote = null;
|
||||
}
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
if(
|
||||
$start !== null &&
|
||||
$array_level === 0 &&
|
||||
$object_level === 0
|
||||
){
|
||||
|
||||
return substr($json, $start, $i - $start + 1);
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
@ -293,8 +293,8 @@ class brave{
|
|||
/*
|
||||
$handle = fopen("scraper/brave.html", "r");
|
||||
$html = fread($handle, filesize("scraper/brave.html"));
|
||||
fclose($handle);*/
|
||||
|
||||
fclose($handle);
|
||||
*/
|
||||
|
||||
try{
|
||||
$html =
|
||||
|
@ -410,20 +410,10 @@ class brave{
|
|||
throw new Exception("Could not grep JavaScript object");
|
||||
}
|
||||
|
||||
$data =
|
||||
rtrim(
|
||||
preg_replace(
|
||||
'/\(Array\(0\)\)\).*$/',
|
||||
"",
|
||||
$grep[1]
|
||||
),
|
||||
" ]"
|
||||
) . "]";
|
||||
|
||||
$data =
|
||||
$this->fuckhtml
|
||||
->parseJsObject(
|
||||
$data
|
||||
$grep[1]
|
||||
);
|
||||
unset($grep);
|
||||
|
||||
|
@ -673,10 +663,7 @@ class brave{
|
|||
$table["Address"] = $result["location"]["postal_address"]["displayAddress"];
|
||||
}
|
||||
|
||||
if(
|
||||
isset($result["location"]["rating"]) &&
|
||||
$result["location"]["rating"] != "void 0"
|
||||
){
|
||||
if(isset($result["location"]["rating"])){
|
||||
|
||||
$table["Rating"] =
|
||||
$result["location"]["rating"]["ratingValue"] . "/" .
|
||||
|
@ -684,19 +671,13 @@ class brave{
|
|||
number_format($result["location"]["rating"]["reviewCount"]) . " votes)";
|
||||
}
|
||||
|
||||
if(
|
||||
isset($result["location"]["contact"]["telephone"]) &&
|
||||
$result["location"]["contact"]["telephone"] != "void 0"
|
||||
){
|
||||
if(isset($result["location"]["contact"]["telephone"])){
|
||||
|
||||
$table["Phone number"] =
|
||||
$result["location"]["contact"]["telephone"];
|
||||
}
|
||||
|
||||
if(
|
||||
isset($result["location"]["price_range"]) &&
|
||||
$result["location"]["price_range"] != "void 0"
|
||||
){
|
||||
if(isset($result["location"]["price_range"])){
|
||||
|
||||
$table["Price"] =
|
||||
$result["location"]["price_range"];
|
||||
|
|
3697
scraper/ddg.php
File diff suppressed because it is too large
|
@ -136,7 +136,7 @@ class ftm{
|
|||
"source" => [
|
||||
[
|
||||
"url" =>
|
||||
"https://s3.thehackerblog.com/findthatmeme/" .
|
||||
"https://findthatmeme.us-southeast-1.linodeobjects.com/" .
|
||||
$thumb,
|
||||
"width" => null,
|
||||
"height" => null
|
||||
|
|
|
@ -2531,8 +2531,6 @@ class google{
|
|||
"div"
|
||||
);
|
||||
|
||||
$date = null;
|
||||
|
||||
if(count($date_div) !== 0){
|
||||
|
||||
foreach($date_div as $div){
|
||||
|
@ -2543,7 +2541,6 @@ class google{
|
|||
"bottom:"
|
||||
) !== false
|
||||
){
|
||||
|
||||
$date =
|
||||
strtotime(
|
||||
$this->fuckhtml
|
||||
|
@ -2551,6 +2548,7 @@ class google{
|
|||
$div
|
||||
)
|
||||
);
|
||||
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
@ -4149,7 +4147,7 @@ class google{
|
|||
throw new Exception("Failed to get HTML");
|
||||
}
|
||||
|
||||
//$html = file_get_contents("scraper/google.txt");
|
||||
//$html = file_get_contents("scraper/google.html");
|
||||
|
||||
return $this->parsepage($html, "web", $search, $proxy, $params);
|
||||
}
|
||||
|
|
File diff suppressed because it is too large
|
@ -220,14 +220,13 @@ class marginalia{
|
|||
"related" => []
|
||||
];
|
||||
|
||||
// API scraper
|
||||
if(config::MARGINALIA_API_KEY !== null){
|
||||
|
||||
try{
|
||||
$json =
|
||||
$this->get(
|
||||
$this->backend->get_ip(), // no nextpage
|
||||
"https://api.marginalia-search.com/" . config::MARGINALIA_API_KEY . "/search/" . urlencode($search),
|
||||
"https://api.marginalia.nu/" . config::MARGINALIA_API_KEY . "/search/" . urlencode($search),
|
||||
[
|
||||
"count" => 20
|
||||
]
|
||||
|
@ -264,57 +263,34 @@ class marginalia{
|
|||
return $out;
|
||||
}
|
||||
|
||||
// HTML parser
|
||||
$proxy = $this->backend->get_ip();
|
||||
// no more cloudflare!! Parse html by default
|
||||
$params = [
|
||||
"query" => $search
|
||||
];
|
||||
|
||||
if($get["npt"]){
|
||||
foreach(["adtech", "recent", "intitle"] as $v){
|
||||
|
||||
[$params, $proxy] =
|
||||
$this->backend->get(
|
||||
$get["npt"],
|
||||
"web"
|
||||
);
|
||||
|
||||
try{
|
||||
$html =
|
||||
$this->get(
|
||||
$proxy,
|
||||
"https://old-search.marginalia.nu/search?" . $params
|
||||
);
|
||||
}catch(Exception $error){
|
||||
if($get[$v] == "yes"){
|
||||
|
||||
throw new Exception("Failed to get HTML");
|
||||
}
|
||||
|
||||
}else{
|
||||
$params = [
|
||||
"query" => $search
|
||||
];
|
||||
|
||||
foreach(["adtech", "recent", "intitle"] as $v){
|
||||
|
||||
if($get[$v] == "yes"){
|
||||
switch($v){
|
||||
|
||||
switch($v){
|
||||
|
||||
case "adtech": $params["adtech"] = "reduce"; break;
|
||||
case "recent": $params["recent"] = "recent"; break;
|
||||
case "intitle": $params["searchTitle"] = "title"; break;
|
||||
}
|
||||
case "adtech": $params["adtech"] = "reduce"; break;
|
||||
case "recent": $params["recent"] = "recent"; break;
|
||||
case "intitle": $params["searchTitle"] = "title"; break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
try{
|
||||
$html =
|
||||
$this->get(
|
||||
$this->backend->get_ip(),
|
||||
"https://search.marginalia.nu/search",
|
||||
$params
|
||||
);
|
||||
}catch(Exception $error){
|
||||
|
||||
try{
|
||||
$html =
|
||||
$this->get(
|
||||
$proxy,
|
||||
"https://old-search.marginalia.nu/search",
|
||||
$params
|
||||
);
|
||||
}catch(Exception $error){
|
||||
|
||||
throw new Exception("Failed to get HTML");
|
||||
}
|
||||
throw new Exception("Failed to get HTML");
|
||||
}
|
||||
|
||||
$this->fuckhtml->load($html);
|
||||
|
@ -411,65 +387,6 @@ class marginalia{
|
|||
];
|
||||
}
|
||||
|
||||
// get next page
|
||||
$this->fuckhtml->load($html);
|
||||
|
||||
$pagination =
|
||||
$this->fuckhtml
|
||||
->getElementsByAttributeValue(
|
||||
"aria-label",
|
||||
"pagination",
|
||||
"nav"
|
||||
);
|
||||
|
||||
if(count($pagination) === 0){
|
||||
|
||||
// no pagination
|
||||
return $out;
|
||||
}
|
||||
|
||||
$this->fuckhtml->load($pagination[0]);
|
||||
|
||||
$pages =
|
||||
$this->fuckhtml
|
||||
->getElementsByClassName(
|
||||
"page-link",
|
||||
"a"
|
||||
);
|
||||
|
||||
$found_current_page = false;
|
||||
|
||||
foreach($pages as $page){
|
||||
|
||||
if(
|
||||
stripos(
|
||||
$page["attributes"]["class"],
|
||||
"active"
|
||||
) !== false
|
||||
){
|
||||
|
||||
$found_current_page = true;
|
||||
continue;
|
||||
}
|
||||
|
||||
if($found_current_page){
|
||||
|
||||
// we found current page index, and we iterated over
|
||||
// the next page <a>
|
||||
|
||||
$out["npt"] =
|
||||
$this->backend->store(
|
||||
parse_url(
|
||||
$page["attributes"]["href"],
|
||||
PHP_URL_QUERY
|
||||
),
|
||||
"web",
|
||||
$proxy
|
||||
);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
return $out;
|
||||
}
|
||||
}
|
||||
|
|
|
@ -701,11 +701,9 @@ class mojeek{
|
|||
if(count($thumb) === 2){
|
||||
|
||||
$answer["thumb"] =
|
||||
urldecode(
|
||||
$this->fuckhtml
|
||||
->getTextContent(
|
||||
$thumb[1]
|
||||
)
|
||||
$this->fuckhtml
|
||||
->getTextContent(
|
||||
$thumb[1]
|
||||
);
|
||||
}
|
||||
}
|
||||
|
|
|
@ -13,104 +13,31 @@ class pinterest{
|
|||
return [];
|
||||
}
|
||||
|
||||
private function get($proxy, $url, $get = [], &$cookies, $header_data_post = null){
|
||||
private function get($proxy, $url, $get = []){
|
||||
|
||||
$curlproc = curl_init();
|
||||
|
||||
if($header_data_post === null){
|
||||
|
||||
// handling GET
|
||||
|
||||
// extract cookies
|
||||
$cookies_tmp = [];
|
||||
curl_setopt($curlproc, CURLOPT_HEADERFUNCTION, function($curlproc, $header) use (&$cookies_tmp){
|
||||
|
||||
$length = strlen($header);
|
||||
|
||||
$header = explode(":", $header, 2);
|
||||
|
||||
if(trim(strtolower($header[0])) == "set-cookie"){
|
||||
|
||||
$cookie_tmp = explode("=", trim($header[1]), 2);
|
||||
|
||||
$cookies_tmp[trim($cookie_tmp[0])] =
|
||||
explode(";", $cookie_tmp[1], 2)[0];
|
||||
}
|
||||
|
||||
return $length;
|
||||
});
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
|
||||
["User-Agent: " . config::USER_AGENT,
|
||||
"Accept: application/json, text/javascript, */*, q=0.01",
|
||||
"Accept-Language: en-US,en;q=0.5",
|
||||
"Accept-Encoding: gzip",
|
||||
"Referer: https://ca.pinterest.com/",
|
||||
"X-Requested-With: XMLHttpRequest",
|
||||
"X-APP-VERSION: 78f8764",
|
||||
"X-Pinterest-AppState: active",
|
||||
"X-Pinterest-Source-Url: /",
|
||||
"X-Pinterest-PWS-Handler: www/index.js",
|
||||
"screen-dpr: 1",
|
||||
"is-preload-enabled: 1",
|
||||
"DNT: 1",
|
||||
"Sec-GPC: 1",
|
||||
"Sec-Fetch-Dest: empty",
|
||||
"Sec-Fetch-Mode: cors",
|
||||
"Sec-Fetch-Site: same-origin",
|
||||
"Connection: keep-alive",
|
||||
"Alt-Used: ca.pinterest.com",
|
||||
"Priority: u=0",
|
||||
"TE: trailers"]
|
||||
);
|
||||
|
||||
if($get !== []){
|
||||
$get = http_build_query($get);
|
||||
$url .= "?" . $get;
|
||||
}
|
||||
}else{
|
||||
|
||||
// handling POST (pagination)
|
||||
if($get !== []){
|
||||
$get = http_build_query($get);
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
|
||||
["User-Agent: " . config::USER_AGENT,
|
||||
"Accept: application/json, text/javascript, */*, q=0.01",
|
||||
"Accept-Language: en-US,en;q=0.5",
|
||||
"Accept-Encoding: gzip",
|
||||
"Content-Type: application/x-www-form-urlencoded",
|
||||
"Content-Length: " . strlen($get),
|
||||
"Referer: https://ca.pinterest.com/",
|
||||
"X-Requested-With: XMLHttpRequest",
|
||||
"X-APP-VERSION: 78f8764",
|
||||
"X-CSRFToken: " . $cookies["csrf"],
|
||||
"X-Pinterest-AppState: active",
|
||||
"X-Pinterest-Source-Url: /search/pins/?rs=ac&len=2&q=" . urlencode($header_data_post) . "&eq=" . urlencode($header_data_post),
|
||||
"X-Pinterest-PWS-Handler: www/search/[scope].js",
|
||||
"screen-dpr: 1",
|
||||
"is-preload-enabled: 1",
|
||||
"Origin: https://ca.pinterest.com",
|
||||
"DNT: 1",
|
||||
"Sec-GPC: 1",
|
||||
"Sec-Fetch-Dest: empty",
|
||||
"Sec-Fetch-Mode: cors",
|
||||
"Sec-Fetch-Site: same-origin",
|
||||
"Connection: keep-alive",
|
||||
"Alt-Used: ca.pinterest.com",
|
||||
"Cookie: " . $cookies["cookie"],
|
||||
"TE: trailers"]
|
||||
);
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_POST, true);
|
||||
curl_setopt($curlproc, CURLOPT_POSTFIELDS, $get);
|
||||
$url .= "?" . $get;
|
||||
}
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_URL, $url);
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_ENCODING, ""); // default encoding
|
||||
|
||||
// http2 bypass
|
||||
curl_setopt($curlproc, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_2_0);
|
||||
curl_setopt($curlproc, CURLOPT_HTTPHEADER,
|
||||
["User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:107.0) Gecko/20100101 Firefox/110.0",
|
||||
"Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
|
||||
"Accept-Language: en-US,en;q=0.5",
|
||||
"Accept-Encoding: gzip",
|
||||
"DNT: 1",
|
||||
"Connection: keep-alive",
|
||||
"Upgrade-Insecure-Requests: 1",
|
||||
"Sec-Fetch-Dest: document",
|
||||
"Sec-Fetch-Mode: navigate",
|
||||
"Sec-Fetch-Site: none",
|
||||
"Sec-Fetch-User: ?1"]
|
||||
);
|
||||
|
||||
curl_setopt($curlproc, CURLOPT_RETURNTRANSFER, true);
|
||||
curl_setopt($curlproc, CURLOPT_SSL_VERIFYHOST, 2);
|
||||
|
@ -127,26 +54,6 @@ class pinterest{
|
|||
throw new Exception(curl_error($curlproc));
|
||||
}
|
||||
|
||||
if($header_data_post === null){
|
||||
|
||||
if(!isset($cookies_tmp["csrftoken"])){
|
||||
|
||||
throw new Exception("Failed to grep CSRF token");
|
||||
}
|
||||
|
||||
$cookies = "";
|
||||
|
||||
foreach($cookies_tmp as $cookie_name => $cookie_value){
|
||||
|
||||
$cookies .= $cookie_name . "=" . $cookie_value . "; ";
|
||||
}
|
||||
|
||||
$cookies = [
|
||||
"csrf" => $cookies_tmp["csrftoken"],
|
||||
"cookie" => rtrim($cookies, " ;")
|
||||
];
|
||||
}
|
||||
|
||||
curl_close($curlproc);
|
||||
return $data;
|
||||
}
|
||||
|
@ -155,68 +62,17 @@ class pinterest{
|
|||
|
||||
if($get["npt"]){
|
||||
|
||||
[$data, $proxy] =
|
||||
$this->backend->get(
|
||||
$get["npt"], "images"
|
||||
);
|
||||
|
||||
$data = json_decode($data, true);
|
||||
|
||||
$search = $data["q"];
|
||||
$cookies = $data["cookies"];
|
||||
|
||||
try{
|
||||
$json =
|
||||
$this->get(
|
||||
$proxy,
|
||||
"https://ca.pinterest.com/resource/BaseSearchResource/get/",
|
||||
// @TODO
|
||||
// post data for next page
|
||||
$data = [
|
||||
"source_url" => "/search/pins/?q=" . urlencode($search) . "&rs=typed",
|
||||
"data" =>
|
||||
json_encode(
|
||||
[
|
||||
"source_url" => "/search/pins/?q=" . urlencode($search) . "&rs=typed",
|
||||
"data" => json_encode(
|
||||
[
|
||||
"options" => [
|
||||
"applied_unified_filters" => null,
|
||||
"appliedProductFilters" => "---",
|
||||
"article" => null,
|
||||
"auto_correction_disabled" => false,
|
||||
"corpus" => null,
|
||||
"customized_rerank_type" => null,
|
||||
"domains" => null,
|
||||
"dynamicPageSizeExpGroup" => null,
|
||||
"filters" => null,
|
||||
"journey_depth" => null,
|
||||
"page_size" => null,
|
||||
"price_max" => null,
|
||||
"price_min" => null,
|
||||
"query_pin_sigs" => null,
|
||||
"query" => $data["q"],
|
||||
"redux_normalize_feed" => true,
|
||||
"request_params" => null,
|
||||
"rs" => "typed",
|
||||
"scope" => "pins",
|
||||
"selected_one_bar_modules" => null,
|
||||
"source_id" => null,
|
||||
"source_module_id" => null,
|
||||
"source_url" => "/search/pins/?q=" . urlencode($search) . "&rs=typed",
|
||||
"top_pin_id" => null,
|
||||
"top_pin_ids" => null,
|
||||
"bookmarks" => [
|
||||
$data["bookmark"]
|
||||
]
|
||||
],
|
||||
"context" => []
|
||||
],
|
||||
JSON_UNESCAPED_SLASHES
|
||||
)
|
||||
],
|
||||
$cookies,
|
||||
$search
|
||||
// {"options":{"applied_filters":null,"appliedProductFilters":"---","article":null,"auto_correction_disabled":false,"corpus":null,"customized_rerank_type":null,"domains":null,"filters":null,"journey_depth":null,"page_size":null,"price_max":null,"price_min":null,"query_pin_sigs":null,"query":"higurashi","redux_normalize_feed":true,"rs":"typed","scope":"pins","selected_one_bar_modules":null,"source_id":null,"source_module_id":null,"top_pin_id":null,"bookmarks":["Y2JVSG81V2sxcmNHRlpWM1J5VFVad1ZsWlVRbXhpVmtreVZsZHpOV0pIU2tkV2FscFhVbXhhVkZreU1WSmtNREZWVjIxR1RrMXNTbEJXYlhSaFVtMVdjMVZ1U2xaaWEzQnpXVlJPVTJWV1pISlhhM1JYVm10V05sVldVbE5XVjBwMVVXMUdWVll6VFhoVWJYaFhWMVp3Ums1V1RsTmlSbGt5Vm10YWFtVkdWbkpOU0dSUFZsZG9XRmxzWkc5VlZscHlWbGhrYkdKR1NubFdWelZQWVVaYWRHVkVRbFppUmtwVVZrUktWMlJIVWtWV2JHaHBVakZLU0Zkc1pEUmtNVnBZVW10b2FsSXdXbkJXYlRWRFpHeGFSMWRzVG1oaGVrWllXV3RvVTFVeFpFaFZiRUpoVm5wRk1GbHFSbXRYVjA1R1YyczFWMVpHV2pSWFZtaDNVakZrY2sxWVRsaGlhM0JXV1ZSR1MyRkdiRlZTYm1SVVVteHdXbGxWVlRGVk1VbDVWRmhrVjAxdVVuWlVhMXBTWlVaT2MxcEhSbE5TTWswMVdtdGFWMU5YU2paVmJYaFRUVmhDUjFZeU5YZFVNVkY0VjJ0b1ZXRnJOVlpVVmxwTFVURndXR042VmxOV2ExcGFXVlZWTlZVeFNYZE5WRTVYVWtWYVZGWkhNVTlXTVU1WllVWk9hR1ZyV2s1WFZ6QXhZakpPVjFWWWFHRlNWbkJRVm14U1IwMUdXWGxOVkVKVlRWWnNORll5TURWV1YwVjVWV3hDV21FeGNETmFSVnByVjFkS1IyTkhhR2xYUjJkM1ZtdGFhMlF4VVhsVGJGcE9Wa1p3YjFwWGVFdFZWbFp4VW14YWJGWnRVbHBaTUdoTFZHMUtTR1ZJYUZkV2VrWjJWMVphU21ReVJYcGpSbFpwVW10d1RGZHJVa0pPVms1SFZHNVNUbFl3V2xoVmJYUldaVVpaZUZremFGUk5hM0JYVkZaYVYyRkZNSGxWYkVKYVlrWlZlRnBGV210WFIwNUpVMnMxVTFaR1dscFdWekI0VFVaV1IxTllaR3BUUlhCb1dWUkdWbVZHVm5SbFJuQnNZbFpKTWxSVlVYaFBSVGxGV1hwR1QyVnJSVEZVVlZKT1RrVXhSVkpVUWs5bGJFVXhWRmhzZDFOR1ZsWmtNMFp0VWpGYWIxZFhjRXBsUlRGSVZWaHdUbFl4YTNoVVZWSnFUVVUxV0ZadGFFOVNSVnB6Vkd0a1drMUdiRFpUVkVaT1pXMWplRmRzVWxkaFJuQllWVlJTVDJWdFRqWlVNVkpTWlZad2NWcEhkRTlsYTFwMFZGVlNhMkpWTVZWVFZFcE9Wa1pzTmxkWE1WSk9WVEYwVlcweFVGWXdXVFJXUjNSWFYwZGFRbEJVTVRoUFJHTXhUbnBCTlUxRVRUUk5SRVV3VG5wUk5VMTVjRWhWVlhkeFprUlZlRTlFVVRKWlZHc3lUMWRSTWsxVVVUSk9iVnBvV1RKWmVrNTZXWGhPTWs1cFQwUkZNVTlFVm1sTlZ
GcHBUV3BTYTFsWFRtcE9SR015VG1wVk5GbHFaR2haVjFacldWUmFiVmxxWkdoYVZGWnFUa1JXT0ZSclZsaG1RVDA5fFVIbzVhRkpYZUc1WFYyUlpWVEpHYkdGNk1XWk5ha1ptVFZSR09FOUVZekZPZWtFMVRVUk5ORTFFUlRCT2VsRTFUWGx3U0ZWVmQzRm1SMWw1VFZSUk1WbDZUVEJhUjFGNVQxZFNhVnB0VlRGT1JFVXdXVlJuZVU1cVRUUk5hbU40VDBSSk1VNXFWVEZOYlZwcVdsUnJlRTFFVVhwWmVsVjNXbXBvYkU1dFJYbE9ha0Y2VDFSSk5VMTZWVEJaYWtJNFZHdFdXR1pCUFQwPXxOb25lfDg3NTcwOTAzODAxNDc0OTMqR1FMKnwzMjM3YjM3ZGNhMGU3YjYyYzYzYzAyZGJkNGU1MjdlNzMyMTExMTNlMmUyMzEyOWM2MDAzYmU1ZTlmZjkwYjAwfE5FV3w="]},"context":{}}
|
||||
]
|
||||
);
|
||||
|
||||
}catch(Exception $error){
|
||||
|
||||
throw new Exception("Failed to fetch JSON");
|
||||
}
|
||||
];
|
||||
|
||||
}else{
|
||||
|
||||
|
@ -225,45 +81,27 @@ class pinterest{
|
|||
|
||||
throw new Exception("Search term is empty!");
|
||||
}
|
||||
|
||||
// https://ca.pinterest.com/resource/BaseSearchResource/get/?source_url=%2Fsearch%2Fpins%2F%3Feq%3Dhigurashi%26etslf%3D5966%26len%3D2%26q%3Dhigurashi%2520when%2520they%2520cry%26rs%3Dac&data=%7B%22options%22%3A%7B%22applied_unified_filters%22%3Anull%2C%22appliedProductFilters%22%3A%22---%22%2C%22article%22%3Anull%2C%22auto_correction_disabled%22%3Afalse%2C%22corpus%22%3Anull%2C%22customized_rerank_type%22%3Anull%2C%22domains%22%3Anull%2C%22dynamicPageSizeExpGroup%22%3Anull%2C%22filters%22%3Anull%2C%22journey_depth%22%3Anull%2C%22page_size%22%3Anull%2C%22price_max%22%3Anull%2C%22price_min%22%3Anull%2C%22query_pin_sigs%22%3Anull%2C%22query%22%3A%22higurashi%20when%20they%20cry%22%2C%22redux_normalize_feed%22%3Atrue%2C%22request_params%22%3Anull%2C%22rs%22%3A%22ac%22%2C%22scope%22%3A%22pins%22%2C%22selected_one_bar_modules%22%3Anull%2C%22source_id%22%3Anull%2C%22source_module_id%22%3Anull%2C%22source_url%22%3A%22%2Fsearch%2Fpins%2F%3Feq%3Dhigurashi%26etslf%3D5966%26len%3D2%26q%3Dhigurashi%2520when%2520they%2520cry%26rs%3Dac%22%2C%22top_pin_id%22%3Anull%2C%22top_pin_ids%22%3Anull%7D%2C%22context%22%3A%7B%7D%7D&_=1736116313987
|
||||
// source_url=%2Fsearch%2Fpins%2F%3Feq%3Dhigurashi%26etslf%3D5966%26len%3D2%26q%3Dhigurashi%2520when%2520they%2520cry%26rs%3Dac
|
||||
// &data=%7B%22options%22%3A%7B%22applied_unified_filters%22%3Anull%2C%22appliedProductFilters%22%3A%22---%22%2C%22article%22%3Anull%2C%22auto_correction_disabled%22%3Afalse%2C%22corpus%22%3Anull%2C%22customized_rerank_type%22%3Anull%2C%22domains%22%3Anull%2C%22dynamicPageSizeExpGroup%22%3Anull%2C%22filters%22%3Anull%2C%22journey_depth%22%3Anull%2C%22page_size%22%3Anull%2C%22price_max%22%3Anull%2C%22price_min%22%3Anull%2C%22query_pin_sigs%22%3Anull%2C%22query%22%3A%22higurashi%20when%20they%20cry%22%2C%22redux_normalize_feed%22%3Atrue%2C%22request_params%22%3Anull%2C%22rs%22%3A%22ac%22%2C%22scope%22%3A%22pins%22%2C%22selected_one_bar_modules%22%3Anull%2C%22source_id%22%3Anull%2C%22source_module_id%22%3Anull%2C%22source_url%22%3A%22%2Fsearch%2Fpins%2F%3Feq%3Dhigurashi%26etslf%3D5966%26len%3D2%26q%3Dhigurashi%2520when%2520they%2520cry%26rs%3Dac%22%2C%22top_pin_id%22%3Anull%2C%22top_pin_ids%22%3Anull%7D%2C%22context%22%3A%7B%7D%7D
|
||||
// &_=1736116313987
|
||||
|
||||
$source_url = "/search/pins/?q=" . urlencode($search) . "&rs=" . urlencode($search);
|
||||
|
||||
$filter = [
|
||||
"source_url" => $source_url,
|
||||
"source_url" => "/search/pins/?q=" . urlencode($search),
|
||||
"rs" => "typed",
|
||||
"data" =>
|
||||
json_encode(
|
||||
[
|
||||
"options" => [
|
||||
"applied_unified_filters" => null,
|
||||
"appliedProductFilters" => "---",
|
||||
"article" => null,
|
||||
"applied_filters" => null,
|
||||
"appliedProductFilters" => "---",
|
||||
"auto_correction_disabled" => false,
|
||||
"corpus" => null,
|
||||
"customized_rerank_type" => null,
|
||||
"domains" => null,
|
||||
"dynamicPageSizeExpGroup" => null,
|
||||
"filters" => null,
|
||||
"journey_depth" => null,
|
||||
"page_size" => null,
|
||||
"price_max" => null,
|
||||
"price_min" => null,
|
||||
"query_pin_sigs" => null,
|
||||
"query" => $search,
|
||||
"query_pin_sigs" => null,
|
||||
"redux_normalize_feed" => true,
|
||||
"request_params" => null,
|
||||
"rs" => "ac",
|
||||
"rs" => "typed",
|
||||
"scope" => "pins", // pins, boards, videos,
|
||||
"selected_one_bar_modules" => null,
|
||||
"source_id" => null,
|
||||
"source_module_id" => null,
|
||||
"source_url" => $source_url,
|
||||
"top_pin_id" => null,
|
||||
"top_pin_ids" => null
|
||||
"source_id" => null
|
||||
],
|
||||
"context" => []
|
||||
]
|
||||
|
@ -272,25 +110,23 @@ class pinterest{
|
|||
];
|
||||
|
||||
$proxy = $this->backend->get_ip();
|
||||
$cookies = [];
|
||||
|
||||
try{
|
||||
$json =
|
||||
$this->get(
|
||||
$proxy,
|
||||
"https://ca.pinterest.com/resource/BaseSearchResource/get/",
|
||||
$filter,
|
||||
$cookies,
|
||||
null
|
||||
);
|
||||
|
||||
}catch(Exception $error){
|
||||
|
||||
throw new Exception("Failed to fetch JSON");
|
||||
}
|
||||
}
|
||||
|
||||
$json = json_decode($json, true);
|
||||
try{
|
||||
$json =
|
||||
json_decode(
|
||||
$this->get(
|
||||
$proxy,
|
||||
"https://www.pinterest.ca/resource/BaseSearchResource/get/",
|
||||
$filter
|
||||
),
|
||||
true
|
||||
);
|
||||
|
||||
}catch(Exception $error){
|
||||
|
||||
throw new Exception("Failed to fetch JSON");
|
||||
}
|
||||
|
||||
if($json === null){
|
||||
|
||||
|
@ -303,60 +139,6 @@ class pinterest{
|
|||
"image" => []
|
||||
];
|
||||
|
||||
if(
|
||||
!isset(
|
||||
$json["resource_response"]
|
||||
["status"]
|
||||
)
|
||||
){
|
||||
|
||||
throw new Exception("Unknown API failure");
|
||||
}
|
||||
|
||||
if($json["resource_response"]["status"] != "success"){
|
||||
|
||||
$status = "Got non-OK response: " . $json["resource_response"]["status"];
|
||||
|
||||
if(
|
||||
isset(
|
||||
$json["resource_response"]["message"]
|
||||
)
|
||||
){
|
||||
|
||||
$status .= " - " . $json["resource_response"]["message"];
|
||||
}
|
||||
|
||||
throw new Exception($status);
|
||||
}
|
||||
|
||||
if(
|
||||
isset(
|
||||
$json["resource_response"]["sensitivity"]
|
||||
["notices"][0]["description"]["text"]
|
||||
)
|
||||
){
|
||||
|
||||
throw new Exception(
|
||||
"Pinterest returned a notice: " .
|
||||
$json["resource_response"]["sensitivity"]["notices"][0]["description"]["text"]
|
||||
);
|
||||
}
|
||||
|
||||
// get NPT
|
||||
if(isset($json["resource_response"]["bookmark"])){
|
||||
|
||||
$out["npt"] =
|
||||
$this->backend->store(
|
||||
json_encode([
|
||||
"q" => $search,
|
||||
"bookmark" => $json["resource_response"]["bookmark"],
|
||||
"cookies" => $cookies
|
||||
]),
|
||||
"images",
|
||||
$proxy
|
||||
);
|
||||
}
|
||||
|
||||
foreach(
|
||||
$json
|
||||
["resource_response"]
|
||||
|
@ -368,7 +150,6 @@ class pinterest{
|
|||
switch($item["type"]){
|
||||
|
||||
case "pin":
|
||||
case "board":
|
||||
|
||||
/*
|
||||
Handle image object
|
||||
|
@ -425,15 +206,42 @@ class pinterest{
|
|||
"height" => (int)$thumb["height"]
|
||||
]
|
||||
],
|
||||
"url" =>
|
||||
$item["link"] === null ?
|
||||
"https://ca.pinterest.com/pin/" . $item["id"] :
|
||||
$item["link"]
|
||||
"url" => "https://www.pinterest.com/pin/" . $item["id"]
|
||||
];
|
||||
break;
|
||||
|
||||
case "board":
|
||||
if(isset($item["cover_pin"]["image_url"])){
|
||||
|
||||
$image = [
|
||||
"url" => $item["cover_pin"]["image_url"],
|
||||
"width" => (int)$item["cover_pin"]["size"][0],
|
||||
"height" => (int)$item["cover_pin"]["size"][1]
|
||||
];
|
||||
}elseif(isset($item["image_cover_url_hd"])){
|
||||
/*
|
||||
$image = [
|
||||
"url" =>
|
||||
"width" => null,
|
||||
"height" => null
|
||||
];*/
|
||||
}
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
return $out;
|
||||
}
|
||||
|
||||
private function getfullresimage($image, $has_og){
|
||||
|
||||
$has_og = $has_og ? "1200x" : "originals";
|
||||
|
||||
return
|
||||
preg_replace(
|
||||
'/https:\/\/i\.pinimg\.com\/[^\/]+\//',
|
||||
"https://i.pinimg.com/" . $has_og . "/",
|
||||
$image
|
||||
);
|
||||
}
|
||||
}
|
||||
|
|
|
@ -1209,16 +1209,15 @@ class yt{
|
|||
|
||||
$reel =
|
||||
$reel
|
||||
->shortsLockupViewModel;
|
||||
->reelItemRenderer;
|
||||
|
||||
array_push(
|
||||
$this->out["reel"],
|
||||
[
|
||||
"title" =>
|
||||
$reel
|
||||
->overlayMetadata
|
||||
->primaryText
|
||||
->content,
|
||||
->headline
|
||||
->simpleText,
|
||||
"description" => null,
|
||||
"author" => [
|
||||
"name" => null,
|
||||
|
@ -1226,22 +1225,30 @@ class yt{
|
|||
"avatar" => null
|
||||
],
|
||||
"date" => null,
|
||||
"duration" => null,
|
||||
"views" => null,
|
||||
"duration" =>
|
||||
$this->textualtime2int(
|
||||
$reel
|
||||
->accessibility
|
||||
->accessibilityData
|
||||
->label
|
||||
),
|
||||
"views" =>
|
||||
$this->truncatedcount2int(
|
||||
$reel
|
||||
->viewCountText
|
||||
->simpleText
|
||||
),
|
||||
"thumb" => [
|
||||
"url" =>
|
||||
$reel
|
||||
->thumbnail
|
||||
->sources[0]
|
||||
->thumbnails[0]
|
||||
->url,
|
||||
"ratio" => "9:16"
|
||||
],
|
||||
"url" =>
|
||||
"https://www.youtube.com/watch?v=" .
|
||||
$reel
|
||||
->onTap
|
||||
->innertubeCommand
|
||||
->reelWatchEndpoint
|
||||
->videoId
|
||||
]
|
||||
);
|
||||
|
|
12
settings.php
|
@ -133,10 +133,6 @@ $settings = [
|
|||
"value" => "google",
|
||||
"text" => "Google"
|
||||
],
|
||||
[
|
||||
"value" => "google_cse",
|
||||
"text" => "Google CSE"
|
||||
],
|
||||
[
|
||||
"value" => "startpage",
|
||||
"text" => "Startpage"
|
||||
|
@ -207,10 +203,6 @@ $settings = [
|
|||
"value" => "google",
|
||||
"text" => "Google"
|
||||
],
|
||||
[
|
||||
"value" => "google_cse",
|
||||
"text" => "Google CSE"
|
||||
],
|
||||
[
|
||||
"value" => "startpage",
|
||||
"text" => "Startpage"
|
||||
|
@ -227,10 +219,10 @@ $settings = [
|
|||
"value" => "solofield",
|
||||
"text" => "Solofield"
|
||||
],
|
||||
[
|
||||
/*[
|
||||
"value" => "pinterest",
|
||||
"text" => "Pinterest"
|
||||
],
|
||||
],*/
|
||||
[
|
||||
"value" => "imgur",
|
||||
"text" => "Imgur"
|
||||
|
|
|
@ -16,7 +16,6 @@
|
|||
|
||||
body{
|
||||
padding:15px 4% 40px;
|
||||
margin:unset;
|
||||
}
|
||||
|
||||
h1,h2,h3,h4,h5,h6{
|
||||
|
|
|
@ -1,40 +0,0 @@
|
|||
:root
|
||||
{
|
||||
--accent : #f79e98;
|
||||
--1d2021 : #180d0c;
|
||||
--282828 : #180d0c;
|
||||
--3c3836 : #251615;
|
||||
--504945 : #251615;
|
||||
--928374 : var(--accent);
|
||||
--a89984 : #d8c5c4;
|
||||
--bdae93 : #d8c5c4;
|
||||
--8ec07c : var(--accent);
|
||||
--ebdbb2 : #d8c5c4;
|
||||
--comment: #928374;
|
||||
--default: #DCC9BC;
|
||||
--keyword: #F07342;
|
||||
--string : var(--accent);
|
||||
--green : #959A6B;
|
||||
--yellow : #E39C45;
|
||||
--red : #CF223E;
|
||||
--white : var(--a89984);
|
||||
--black : var(--1d2021);
|
||||
--hover : #b18884
|
||||
}
|
||||
|
||||
a.link, a { color: var(--accent); text-decoration: none; }
|
||||
.searchbox { width: 23%; }
|
||||
.filters filter select { color: #E39C45; }
|
||||
.web .separator::before { color: var(--white) }
|
||||
.searchbox input[type="text"]::placeholder { color: var(--white); }
|
||||
a.link:hover
|
||||
{
|
||||
color: var(--hover);
|
||||
text-shadow: 0 0 .2rem var(--hover);
|
||||
}
|
||||
.code-inline
|
||||
{ border-color: var(--default); font-family: monospace;}
|
||||
.home #center a
|
||||
{ color: var(--accent); }
|
||||
.home .subtext
|
||||
{ color: var(--white); }
|